SYSTEM FOR THE IN VIVO DELIVERY AND
EXPRESSION OF HETEROLOGOUS GENES IN 
THE BONE MARROW 

FEDERALLY SPONSORED RESEARCH

This invention was made with Government support under Grant 
Number 5 R01 AI22186 from the National Institutes of Health. The Government
has certain rights to this invention. 

FIELD OF THE INVENTION 
The present invention relates to recombinant DNA technology, and in
particular to introducing and expressing foreign DNA in a eukaryotic cell.

BACKGROUND OF THE INVENTION 
The Alphavirus genus includes a variety of viruses all of which are
members of the Togaviridae family. The alphaviruses include Eastern Equine
Encephalitis virus (EEE), Venezuelan Equine Encephalitis virus (VEE), Everglades
virus, Mucambo virus, Pixuna virus, Western Equine Encephalitis virus (WEE),
Sindbis virus, South African Arbovirus No. 86 (S.A.AR86), Girdwood S.A.
virus, Ockelbo virus, Semliki Forest virus, Middelburg virus, Chikungunya virus,
O'Nyong-Nyong virus, Ross River virus, Barmah Forest virus, Getah virus,
Sagiyama virus, Bebaru virus, Mayaro virus, Una virus, Aura virus, Whataroa
virus, Babanki virus, Kyzylagach virus, Highlands J virus, Fort Morgan virus,
Ndumu virus, and Buggy Creek virus.

The alphavirus genome is a single-stranded, messenger-sense RNA,
modified at the 5'-end with a methylated cap, and at the 3'-end with a
variable-length poly(A) tract. The viral genome is divided into two regions: the first
encodes the nonstructural or replicase proteins (nsP1-nsP4) and the second encodes
the viral structural proteins. Strauss and Strauss, Microbiological Rev. 58,
491-562, 494 (1994). Structural subunits consisting of a single viral protein, C,
associate with themselves and with the RNA genome in an icosahedral
nucleocapsid. In the virion, the capsid is surrounded by a lipid envelope covered
with a regular array of transmembranal protein spikes, each of which consists of
a heterodimeric complex of two glycoproteins, E1 and E2. See Paredes et al.,
Proc. Natl. Acad. Sci. USA 90, 9095-99 (1993); Paredes et al., Virology 187,
324-32 (1993); Pedersen et al., J. Virol. 14:40 (1974).

Sindbis virus, the prototype member of the alphavirus genus of the 
family Togaviridae, and viruses related to Sindbis are broadly distributed 
throughout Africa, Europe, Asia, the Indian subcontinent, and Australia, based on 
serological surveys of humans, domestic animals and wild birds. Kokernot et al.,
Trans. R. Soc. Trop. Med. Hyg. 59, 553-62 (1965); Redaksie, S. Afr. Med. J. 42,
197 (1968); Adekolu-John and Fagbami, Trans. R. Soc. Trop. Med. Hyg. 77,
149-51 (1983); Darwish et al., Trans. R. Soc. Trop. Med. Hyg. 77, 442-45 (1983);
Lundström et al., Epidemiol. Infect. 106, 567-74 (1991); Morrill et al., J. Trop.
Med. Hyg. 94, 166-68 (1991). The first isolate of Sindbis virus (strain AR339)
was recovered from a pool of Culex sp. mosquitoes collected in Sindbis, Egypt in
1953 (Taylor et al., Am. J. Trop. Med. Hyg. 4, 844-62 (1955)), and is the most
extensively studied representative of this group. Other members of the Sindbis
group of alphaviruses include South African Arbovirus No. 86, Ockelbo82, and
Girdwood S.A. These viruses are not strains of the Sindbis virus; they are related
to Sindbis AR339, but they are more closely related to each other based on
nucleotide sequence and serological comparisons. Lundström et al., J. Wildl. Dis.
29, 189-95 (1993); Simpson et al., Virology 222, 464-69 (1996). Ockelbo82,
S.A.AR86 and Girdwood S.A. are all associated with human disease, whereas
Sindbis is not. The clinical symptoms of human infection with Ockelbo82,
S.A.AR86, or Girdwood S.A. are a febrile illness, general malaise, maculopapular
rash, and joint pain that occasionally progresses to a polyarthralgia sometimes 
lasting from a few months to a few years. 

The study of these viruses has led to the development of beneficial 
techniques for vaccinating against the alphavirus diseases, and other diseases
through the use of alphavirus vectors for the introduction of foreign DNA. See 
United States Patent No. 5,185,440 to Davis et al., and PCT Publication WO 
92/10578. It is intended that all United States patent references be incorporated 
in their entirety by reference. 

It is well known that live, attenuated viral vaccines are among the
most successful means of controlling viral disease. However, for some virus
pathogens, immunization with a live virus strain may be either impractical or
unsafe. One alternative strategy is the insertion of sequences encoding immunizing
antigens of such agents into a vaccine strain of another virus. One such system
utilizing a live VEE vector is described in United States Patent No. 5,505,947 to
Johnston et al.

Sindbis virus vaccines have been employed as viral carriers in virus 
constructs which express genes encoding immunizing antigens for other viruses. 
See United States Patent No. 5,217,879 to Huang et al. Huang et al. describe
Sindbis infectious viral vectors. However, the reference does not describe the
cDNA sequence of Girdwood S.A. and TR339, nor clones or viral vectors 
produced therefrom. 

Another such system is described by Hahn et al., Proc. Natl. Acad.
Sci. USA 89:2679 (1992), wherein Sindbis virus constructs which express a
truncated form of the influenza hemagglutinin protein are described. The
constructs are used to study antigen processing and presentation in vitro and in
mice. Although no infectious challenge dose is tested, it is also suggested that
such constructs might be used to produce protective B- and T-cell mediated
immunity. 

London et al., Proc. Natl. Acad. Sci. USA 89, 207-11 (1992),
disclose a method of producing an immune response in mice against a lethal Rift
Valley Fever (RVF) virus by infecting the mice with an infectious Sindbis virus
containing an RVF epitope. London does not disclose using Girdwood S.A. or
TR339 to induce an immune response in animals. 

Viral carriers can also be used to introduce and express foreign 
DNA in eukaryotic cells. One goal of such techniques is to employ vectors that 
target expression to particular cells and/or tissues. A current approach has been 
to remove target cells from the body, culture them ex vivo, infect them with an 
expression vector, and then reintroduce them into the patient. 

PCT Publication No. WO 92/10578 to Garoff and Liljestrom
provides a system for introducing and expressing foreign proteins in animal cells
using alphaviruses. This reference discloses the use of Semliki Forest virus to 
introduce and express foreign proteins in animal cells. The use of Girdwood S.A. 
or TR339 is not discussed. Furthermore, this reference does not provide a method 
of targeting and introducing foreign DNA into specific cell or tissue types. 

Accordingly, there remains a need in the art for full-length cDNA 
clones of positive-strand RNA viruses, such as Girdwood S.A. and TR339. In
addition, there is an ongoing need in the art for improved vaccination strategies. 
Finally, there remains a need in the art for improved methods and nucleic acid 
sequences for delivering foreign DNA to target cells. 



SUMMARY OF THE INVENTION
A first aspect of the present invention is a method of introducing
and expressing heterologous RNA in bone marrow cells, comprising: (a) providing
a recombinant alphavirus, the alphavirus containing a heterologous RNA segment,
the heterologous RNA segment comprising a promoter operable in bone marrow
cells operatively associated with a heterologous RNA to be expressed in bone
marrow cells; and then (b) contacting the recombinant alphavirus to the bone
marrow cells so that the heterologous RNA segment is introduced and expressed
therein.



As a second aspect, the present invention provides a helper cell for 
expressing an infectious, propagation defective, Girdwood S.A. virus particle, 
comprising, in a Girdwood S.A.-permissive cell: (a) a first helper RNA encoding
(i) at least one Girdwood S.A. structural protein, and (ii) not encoding at least one
other Girdwood S.A. structural protein; and (b) a second helper RNA separate
from the first helper RNA, the second helper RNA (i) not encoding the at least one
Girdwood S.A. structural protein encoded by the first helper RNA, and (ii)
encoding the at least one other Girdwood S.A. structural protein not encoded by
the first helper RNA, and with all of the Girdwood S.A. structural proteins
encoded by the first and second helper RNAs assembling together into Girdwood
S.A. particles in the cell containing the replicon RNA; and wherein the Girdwood
S.A. packaging segment is deleted from at least the first helper RNA. 

A third aspect of the present invention is a method of making 
infectious, propagation defective, Girdwood S.A. virus particles, comprising: 
transfecting a Girdwood S.A.-permissive cell with a propagation defective replicon 
RNA, the replicon RNA including the Girdwood S.A. packaging segment and an 
inserted heterologous RNA; producing the Girdwood S.A. virus particles in the 
transfected cell; and then collecting the Girdwood S.A. virus particles from the 
cell. Also disclosed are infectious Girdwood S.A. RNAs, cDNAs encoding the 
same, infectious Girdwood S.A. virus particles, and pharmaceutical formulations 
thereof. 

As a fourth aspect, the present invention provides a helper cell for 
expressing an infectious, propagation defective, TR339 virus particle, comprising, 
in a TR339-permissive cell: (a) a first helper RNA encoding (i) at least one TR339
structural protein, and (ii) not encoding at least one other TR339 structural protein;
and (b) a second helper RNA separate from the first helper RNA, the second
helper RNA (i) not encoding the at least one TR339 structural protein encoded by
the first helper RNA, and (ii) encoding the at least one other TR339 structural
protein not encoded by the first helper RNA, and with all of the TR339 structural 
proteins encoded by the first and second helper RNAs assembling together into 
TR339 particles in the cell containing the replicon RNA; and wherein the TR339 
packaging segment is deleted from at least the first helper RNA. 






A fifth aspect of the present invention is a method of making 
infectious, propagation defective, TR339 virus particles, comprising: transfecting
a TR339-permissive cell with a propagation defective replicon RNA, the replicon
RNA including the TR339 packaging segment and an inserted heterologous RNA;
producing the TR339 virus particles in the transfected cell; and then collecting the
TR339 virus particles from the cell. Also disclosed are infectious TR339 RNAs,
cDNAs encoding the same, infectious TR339 virus particles, and pharmaceutical 
formulations thereof. 



As a sixth aspect, the present invention provides a recombinant 
DNA comprising a cDNA coding for an infectious Girdwood S.A. virus RNA
transcript, and a heterologous promoter positioned upstream from the cDNA and
operatively associated therewith. The present invention also provides infectious 
RNA transcripts encoded by the above-mentioned cDNA and infectious viral 
particles containing the infectious RNA transcripts. 






As a seventh aspect, the present invention provides a recombinant 
DNA comprising a cDNA coding for a Sindbis strain TR339 RNA transcript, and 
a heterologous promoter positioned upstream from the cDNA and operatively 
associated therewith. The present invention also provides infectious RNA 
transcripts encoded by the above-mentioned cDNA and infectious viral particles 
containing the infectious RNA transcripts. 

The foregoing and other aspects of the present invention are
described in the detailed description set forth below. 

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 presents the cDNA sequence (SEQ ID NO:1) of
S.A.AR86. The RNA sequence of the 5' 40 nucleotides was obtained by direct
sequencing of the genomic RNA. The rest of the genome was sequenced by
RT-PCR of fragments amplified from virion RNA. Nucleotides 1 through 59
represent the 5' UTR, the non-structural polyprotein is encoded by nucleotides 60
through 7559 (nsP1-nt60 through nt1679; nsP2-nt1680 through nt4099;
nsP3-nt4100 through nt5729; nsP4-nt5730 through nt7559), the structural polyprotein
is encoded by nucleotides 7608 through 11342 (capsid-nt7608 through nt8399;
E3-nt8400 through nt8591; E2-nt8592 through nt9860; 6K-nt9861 through nt10025;
E1-nt10026 through nt11342), and the 3' UTR is represented by nucleotides
11346 through 11663.
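
The coordinate map recited for Figure 1 can be held in a small lookup table. The following is a minimal sketch, assuming 1-based inclusive coordinates as recited above; the names SAAR86_FEATURES and feature_at are illustrative only and do not appear in the specification.

```python
# Feature coordinates (1-based, inclusive) transcribed from the description
# of Figure 1. The table name and the helper below are hypothetical.
SAAR86_FEATURES = [
    ("5' UTR", 1, 59),
    ("nsP1", 60, 1679),
    ("nsP2", 1680, 4099),
    ("nsP3", 4100, 5729),
    ("nsP4", 5730, 7559),
    ("capsid", 7608, 8399),
    ("E3", 8400, 8591),
    ("E2", 8592, 9860),
    ("6K", 9861, 10025),
    ("E1", 10026, 11342),
    ("3' UTR", 11346, 11663),
]


def feature_at(position, features=SAAR86_FEATURES):
    """Return the name of the feature covering a 1-based nucleotide position,
    or None when the position lies in a region the description leaves unassigned."""
    for name, start, end in features:
        if start <= position <= end:
            return name
    return None


if __name__ == "__main__":
    for nt in (1, 60, 5000, 7600, 9000, 11663):
        print(nt, feature_at(nt))
```

Positions between the recited ranges (for example, nucleotides 7560 through 7607, between the two polyprotein coding regions) return None, because the description assigns no feature to them.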

Figure 1A shows nucleotides 1 through 3800 of the cDNA sequence
of S.A.AR86.

Figure 1B shows nucleotides 3801 through 7900 of the cDNA
sequence of S.A.AR86. 

Figure 1C shows nucleotides 7901 through 11663 of the cDNA 
sequence of S.A.AR86.

Figure 2 presents the putative amino acid sequences of the 
S.A.AR86 polyproteins (SEQ ID NO:2 and SEQ ID NO:3). The amino acids 
were derived from the S.A.AR86 cDNA sequence given in Figure 1 (SEQ ID
NO:1).

Figure 2A shows the amino acid sequence of the non-structural
polyprotein of S.A.AR86 (SEQ ID NO:2). 

Figure 2B shows the amino acid sequence of the structural 
polyprotein of S.A.AR86 (SEQ ID NO:3). 

Figure 3 presents the cDNA sequence (SEQ ID NO:4) of Girdwood 
S.A. The RNA sequence of the 5' 40 nucleotides was obtained by direct 
sequencing of the genomic RNA. The rest of the genome sequence was obtained 
by sequencing of fragments amplified by RT-PCR from virion RNA. An "N" in 
the sequence indicates that the identity of the nucleotide at that position is 
unknown. Nucleotides 1 through 59 represent the 5' UTR, the non-structural 
polyprotein is encoded by nucleotides 60 through 7613 (nsP1-nt60 through
nt1679; nsP2-nt1680 through nt4099; nsP3-nt4100 through nt5762 or nt5783;
nsP4-nt5784 through nt7613), the structural polyprotein is encoded by nucleotides
7662 through 11396 (capsid-nt7662 through nt8453; E3-nt8454 through nt8645;
E2-nt8646 through nt9914; 6K-nt9915 through nt10079; E1-nt10080 through
nt11396), and the 3' UTR is represented by nucleotides 11400 through 11717.
There is an opal termination codon at nucleotides 5763 through 5765. 

Figure 3A shows nucleotides 1 through 3800 of the cDNA sequence 
of Girdwood S.A. 

Figure 3B shows nucleotides 3801 through 7900 of the cDNA 
sequence of Girdwood S.A. 

Figure 3C shows nucleotides 7901 through 11717 of the cDNA 
sequence of Girdwood S.A. 

Figure 4 illustrates the putative amino acid sequences of the 
Girdwood S.A. polyproteins (SEQ ID NO:5 and SEQ ID NO:6). The amino 
acids were derived from the Girdwood S.A. cDNA sequence given in Figure 3
(SEQ ID NO:4). 

Figure 4A shows the amino acid sequence of the non-structural 
polyprotein of Girdwood S.A. The sequence terminates at the opal termination 
codon. The complete amino acid sequence is presented in SEQ ID NO:5. 

Figure 4B shows the amino acid sequence of the structural 
polyprotein of Girdwood S.A. (SEQ ID NO:6). 

Figure 5 illustrates the nucleotide sequence (SEQ ID NO:7) of 
clone pS55, a cDNA clone of the S.A.AR86 genomic RNA.

Figure 5A shows nucleotides 1 through 6720 of the cDNA sequence
of pS55.

Figure 5B shows nucleotides 6721 through 11663 of the cDNA
sequence of pS55. 

Figure 6 presents the cDNA sequence (SEQ ID NO:8) of clone 
pTR339. The TR339 virus is derived from this clone. Nucleotides 1 through 59 
represent the 5' UTR, the non-structural polyprotein is encoded by nucleotides 60 
through 7598 (nsP1-nt60 through nt1679; nsP2-nt1680 through nt4099;
nsP3-nt4100 through nt5747 or nt5768; nsP4-nt5769 through nt7598), the structural
polyprotein is encoded by nucleotides 7647 through 11381 (capsid-nt7647 through
nt8438; E3-nt8439 through nt8630; E2-nt8631 through nt9899; 6K-nt9900
through nt10064; E1-nt10065 through nt11381), and the 3' UTR is represented
by nucleotides 11382 through 11703. There is an opal termination codon at 
nucleotides 5748 through 5750. 
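
The opal codon positions recited for Girdwood S.A. (Figure 3) and pTR339 (Figure 6) can be checked arithmetically against the start of the non-structural open reading frame at nucleotide 60. The sketch below is illustrative only: the names GENOMES and codon_index are hypothetical, and coordinates are assumed to be 1-based and inclusive as recited.

```python
# Coordinates (1-based, inclusive) taken from the descriptions of Figures 3
# and 6. The dictionary layout and function name are hypothetical.
GENOMES = {
    "Girdwood S.A.": {"orf": (60, 7613), "opal": (5763, 5765)},
    "TR339": {"orf": (60, 7598), "opal": (5748, 5750)},
}


def codon_index(orf_start, position):
    """Return the 1-based number of the codon that begins at `position`,
    or None if `position` is not in frame with an ORF starting at `orf_start`."""
    offset = position - orf_start
    if offset < 0 or offset % 3 != 0:
        return None
    return offset // 3 + 1


if __name__ == "__main__":
    for name, coords in GENOMES.items():
        orf_start, orf_end = coords["orf"]
        opal_start, _ = coords["opal"]
        total_codons = (orf_end - orf_start + 1) // 3
        print(name, "opal codon number:", codon_index(orf_start, opal_start),
              "of", total_codons)
```

Under these assumptions the opal codon is in frame in both genomes, falling at codon 1902 of 2518 for Girdwood S.A. and codon 1897 of 2513 for TR339, which is consistent with the two alternative nsP3 end points recited for each sequence.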

Figure 6A shows nucleotides 1 through 6720 of the cDNA sequence
of pTR339.

Figure 6B shows nucleotides 6721 through 11703 of the cDNA
sequence of pTR339.

DETAILED DESCRIPTION OF THE INVENTION 
The production and use of recombinant DNA, vectors, transformed
host cells, selectable markers, proteins, and protein fragments by genetic
engineering are well-known to those skilled in the art. See, e.g., United States
Patent No. 4,761,371 to Bell et al. at Col. 6 line 3 to Col. 9 line 65; United States
Patent No. 4,877,729 to Clark et al. at Col. 4 line 38 to Col. 7 line 6; United
States Patent No. 4,912,038 to Schilling at Col. 3 line 26 to Col. 14 line 12; and
United States Patent No. 4,879,224 to Wallner at Col. 6 line 8 to Col. 8 line 59.

The term "alphavirus" has its conventional meaning in the art, and
includes the various species of alphaviruses such as Eastern Equine Encephalitis
virus (EEE), Venezuelan Equine Encephalitis virus (VEE), Everglades virus,
Mucambo virus, Pixuna virus, Western Equine Encephalitis virus (WEE), Sindbis virus,
South African Arbovirus No. 86, Girdwood S.A. virus, Ockelbo virus, Semliki
Forest virus, Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross
River virus, Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus,
Mayaro virus, Una virus, Aura virus, Whataroa virus, Babanki virus, Kyzylagach
virus, Highlands J virus, Fort Morgan virus, Ndumu virus, Buggy Creek virus,
and any other virus classified by the International Committee on Taxonomy of
Viruses (ICTV) as an alphavirus. The preferred alphaviruses for use in the present
invention include Sindbis virus strains (e.g., TR339), Girdwood S.A., S.A.AR86,
and Ockelbo82.

An "Old World alphavirus" is a virus that is primarily distributed
throughout the Old World. Alternately stated, an Old World alphavirus is a virus
that is primarily distributed throughout Africa, Asia, Australia and New Zealand,
or Europe. Exemplary Old World viruses include SF group alphaviruses and SIN
group alphaviruses. SF group alphaviruses include Semliki Forest virus,
Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross River virus,
Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus,
and Una virus. SIN group alphaviruses include Sindbis virus, South African
Arbovirus No. 86, Ockelbo virus, Girdwood S.A. virus, Aura virus, Whataroa
virus, Babanki virus, and Kyzylagach virus.

Acceptable alphaviruses include those containing attenuating
mutations. The phrases "attenuating mutation" and "attenuating amino acid," as
used herein, mean a nucleotide sequence containing a mutation, or an amino acid
encoded by a nucleotide sequence containing a mutation, which mutation results
in a decreased probability of causing disease in its host (i.e., a loss of virulence),
in accordance with standard terminology in the art, whether the mutation be a
substitution mutation or an in-frame deletion mutation. See, e.g., B. DAVIS ET
AL., MICROBIOLOGY 132 (3d ed. 1980). The phrase "attenuating mutation"
excludes mutations or combinations of mutations which would be lethal to the
virus.

Appropriate attenuating mutations will be dependent upon the 
alphavirus used. Suitable attenuating mutations within the alphavirus genome will 
be known to those skilled in the art. Exemplary attenuating mutations include, but 
are not limited to, those described in United States Patent No. 5,505,947 to 
Johnston et al., copending United States application 08/448,630 to Johnston et al., 
and copending United States application 08/446,932 to Johnston et al. It is 
intended that all United States patent references be incorporated in their entirety 
by reference. 

Attenuating mutations may be introduced into the RNA by 
performing site-directed mutagenesis on the cDNA which encodes the RNA, in
accordance with known procedures. See, Kunkel, Proc. Natl. Acad. Sci. USA 82,
488 (1985), the disclosure of which is incorporated herein by reference in its
entirety. Alternatively, mutations may be introduced into the RNA by replacement
of homologous restriction fragments in the cDNA which encodes the RNA, in
accordance with known procedures. 

I. Methods for Introducing and Expressing Heterologous RNA in Bone
Marrow Cells.

The present invention provides methods of using a recombinant 
alphavirus to introduce and express a heterologous RNA in bone marrow cells. 
Such methods are useful as vaccination strategies when the heterologous RNA 
encodes an immunogenic protein or peptide. Alternatively, such methods are 
useful in introducing and expressing in bone marrow cells an RNA which encodes 
a desirable protein or peptide, for example, a therapeutic protein or peptide. 

The present invention is carried out using a recombinant alphavirus 
to introduce a heterologous RNA into bone marrow cells. Any alphavirus that 
targets and infects bone marrow cells is suitable. Preferred alphaviruses include 
Old World alphaviruses, more preferably SF group alphaviruses and SIN group 
alphaviruses, more preferably Sindbis virus strains (e.g., TR339), S.A.AR86
virus, Girdwood S.A. virus, and Ockelbo virus. In a more preferred embodiment, 
the alphavirus contains one or more attenuating mutations, as described 
hereinabove. 

Two types of recombinant virus vector are contemplated in carrying 
out the present invention. In one embodiment employing "double promoter 
vectors," the heterologous RNA is inserted into a replication and propagation 
competent virus. Double promoter vectors are described in United States Patent 
No. 5,505,947 to Johnston et al. With this type of viral vector, it is preferable 
that heterologous RNA sequences of less than 3 kilobases are inserted into the viral 
vector, more preferably those less than 2 kilobases, and more preferably still those 
less than 1 kilobase. In an alternate embodiment, propagation-defective "replicon 
vectors," as described in copending United States application 08/448,630 to 
Johnston et al., will be used. One advantage of replicon viral vectors is that larger
RNA inserts, up to approximately 4-5 kilobases in length, can be utilized. Double
promoter vectors and replicon vectors are described in more detail hereinbelow. 
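
The insert-size preferences stated above for the two vector types can be expressed as a simple rule. The following is a minimal sketch under stated assumptions: the function name suggest_vectors and the returned labels are hypothetical, and the 5 kilobase replicon ceiling is an assumed reading of "approximately 4-5 kilobases" rather than a limit fixed by the specification.

```python
# Thresholds in kilobases taken from the paragraph above. The replicon
# ceiling of 5.0 kb is an assumption (upper end of "approximately 4-5 kb").
REPLICON_MAX_KB = 5.0


def suggest_vectors(insert_kb):
    """Return a list of (vector type, note) pairs compatible with an insert
    of the given length in kilobases, following the stated size preferences."""
    if insert_kb <= 0:
        raise ValueError("insert length must be positive")
    suggestions = []
    if insert_kb < 1.0:
        suggestions.append(("double promoter", "most preferred size range"))
    elif insert_kb < 2.0:
        suggestions.append(("double promoter", "more preferred size range"))
    elif insert_kb < 3.0:
        suggestions.append(("double promoter", "preferred size range"))
    if insert_kb <= REPLICON_MAX_KB:
        suggestions.append(("replicon", "within approximate capacity"))
    return suggestions


if __name__ == "__main__":
    for size in (0.8, 2.5, 4.2, 6.0):
        print(size, suggest_vectors(size))
```

An empty list indicates only that the insert exceeds the sizes discussed here; it is not a statement that such an insert could never be accommodated.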

The recombinant alphaviruses of the claimed method target the
heterologous RNA to bone marrow cells, where it expresses the encoded protein or
peptide. Heterologous RNA can be introduced and expressed in any cell type found in the
bone marrow. Bone marrow cells that may be targeted by the recombinant alphaviruses of
the present invention include, but are not limited to, polymorphonuclear cells, hemopoietic
stem cells (including megakaryocyte colony forming units (CFU-M), spleen colony
forming units (CFU-S), erythroid colony forming units (CFU-E), erythroid burst forming
units (BFU-E), and colony forming units in culture (CFU-C)), erythrocytes, macrophages
(including reticular cells), monocytes, granulocytes, megakaryocytes, lymphocytes,
fibroblasts, osteoprogenitor cells, osteoblasts, osteoclasts, marrow stromal cells,
chondrocytes and other cells of synovial joints. Preferably, marrow cells within the
endosteum are targeted, more preferably osteoblasts. Also preferred are methods in which
cells in the endosteum of synovial joints (e.g., hip and knee joints) are targeted.

By targeting to the cells of the bone marrow, it is meant that the primary 
site in which the virus will be localized in vivo is the cells of the bone marrow.
Alternately stated, the alphaviruses of the present invention target bone marrow cells, such 
that titers in bone marrow two days after infection are greater than 100 PFU/g crushed 
bone, preferably greater than 200 PFU/g crushed bone, more preferably greater than 300 
PFU/g crushed bone, and more preferably still greater than 500 PFU/g crushed bone. 
Virus may be detected occasionally in other cell or tissue types, but only sporadically and
usually at low levels. Virus localization in the bone marrow can be demonstrated by any 
suitable technique known in the art, such as in situ hybridization. 
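
The titer thresholds recited above can be applied to a measured value as follows. This is a minimal sketch, assuming a titer expressed in PFU per gram of crushed bone measured two days after infection; the function name targeting_tier and the tier labels are hypothetical.

```python
# Thresholds (PFU/g crushed bone, two days after infection) from the
# paragraph above, ordered from most to least stringent. Labels are illustrative.
TIERS = [
    (500, "more preferably still"),
    (300, "more preferably"),
    (200, "preferably"),
    (100, "meets targeting threshold"),
]


def targeting_tier(pfu_per_gram):
    """Return the label of the most stringent threshold that the titer
    strictly exceeds, or None if the titer does not exceed 100 PFU/g."""
    for threshold, label in TIERS:
        if pfu_per_gram > threshold:
            return label
    return None


if __name__ == "__main__":
    for titer in (80, 150, 250, 450, 1200):
        print(titer, targeting_tier(titer))
```

The comparison is strict because the text recites titers "greater than" each value, so a titer of exactly 100 PFU/g returns None.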

Bone marrow cells are long-lived and harbor infectious alphaviruses for a 
prolonged period of time, as demonstrated in the Examples below. These characteristics 
of bone marrow cells render the present invention useful not only for the purpose of
supplying a desired protein or peptide to skeletal tissue, but also for expressing proteins or 
peptides in vivo that are needed by other cell or tissue types. 

The present invention can be carried out in vivo or with cultured bone 
marrow cells in vitro. Bone marrow cell cultures include primary cultures 
of bone marrow cells, serially-passaged cultures of bone marrow cells, and
cultures of immortalized bone marrow cell lines. Bone marrow cells may be 
cultured by any suitable means known in the art. 

The recombinant alphaviruses of the present invention carry a
heterologous RNA segment. The heterologous RNA segment encodes a promoter
and an inserted heterologous RNA. The inserted heterologous RNA may encode
any protein or a peptide which is desirably expressed by the host bone marrow
cells. Suitable heterologous RNA may be of prokaryotic (e.g., RNA encoding the
Botulinus toxin C), or eukaryotic (e.g., RNA encoding malaria Plasmodium
protein cs1) origin. Illustrative proteins and peptides encoded by the heterologous
RNAs of the present invention include hormones, growth factors, interleukins,
cytokines, chemokines, enzymes, and ribozymes. Alternately, the heterologous
RNAs encode any therapeutic protein or peptide. As a further alternative, the
heterologous RNAs of the present invention encode any immunogenic protein or
peptide.

An immunogenic protein or peptide, or "immunogen," may be any
protein or peptide suitable for protecting the subject against a disease, including
but not limited to microbial, bacterial, protozoal, parasitic, and viral diseases. For
example, the immunogen may be an orthomyxovirus immunogen (e.g., an
influenza virus immunogen, such as the influenza virus hemagglutinin (HA)
surface protein or the influenza virus nucleoprotein gene, or an equine influenza
virus immunogen), or a lentivirus immunogen (e.g., an equine infectious anemia
virus immunogen, a Simian Immunodeficiency Virus (SIV) immunogen, or a
Human Immunodeficiency Virus (HIV) immunogen, such as the HIV envelope
GP160 protein and the HIV matrix/capsid proteins). The immunogen may also be
an arenavirus immunogen (e.g., Lassa fever virus immunogen, such as the Lassa
fever virus nucleocapsid protein gene and the Lassa fever envelope glycoprotein
gene), a poxvirus immunogen (e.g., vaccinia), a flavivirus immunogen (e.g., a
yellow fever virus immunogen or a Japanese encephalitis virus immunogen), a
filovirus immunogen (e.g., an Ebola virus immunogen, or a Marburg virus






WO 98/36779 



PCT/US98/02945 




immunogen), a bunyavirus immunogen (e.g., RVFV, CCHF, and SFS viruses),
or a coronavirus immunogen (e.g., an infectious human coronavirus immunogen,
such as the human coronavirus envelope glycoprotein gene, or a transmissible 
gastroenteritis virus immunogen for pigs, or an infectious bronchitis virus 
immunogen for chickens). 

Alternatively, the present invention can be used to express 
heterologous RNAs encoding antisense oligonucleotides. In general, "antisense" 
refers to the use of small, synthetic oligonucleotides to inhibit gene expression by 
inhibiting the function of the target mRNA containing the complementary 
sequence. Milligan, J.F. et al., J. Med. Chem. 36(14), 1923-1937 (1993). Gene
expression is inhibited through hybridization to coding (sense) sequences in a 
specific mRNA target by hydrogen bonding according to Watson-Crick base 
pairing rules. The mechanism of antisense inhibition is that the exogenously 
applied oligonucleotides decrease the mRNA and protein levels of the target gene. 
Milligan, J.F. et al., J. Med. Chem. 36(14), 1923-1937 (1993). See also Helene, 
C. and Toulme, J., Biochim. Biophys. Acta 1049, 99-125 (1990); Cohen, J.S.,
Ed., OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE
EXPRESSION, CRC Press:Boca Raton, FL (1987). 

Antisense oligonucleotides may be of any suitable length, depending 
on the particular target being bound. The only limit on the length of the antisense
oligonucleotide is the capacity of the virus for inserted heterologous RNA.
Antisense oligonucleotides may be complementary to the entire mRNA transcript 
of the target gene or only a portion thereof. Preferably the antisense 
oligonucleotide is directed to an mRNA region containing a junction between 
intron and exon. Where the antisense oligonucleotide is directed to an intron/exon 
junction, it may either entirely overlie the junction or may be sufficiently close to 
the junction to inhibit splicing out of the intervening exon during processing of 
precursor mRNA to mature mRNA (e.g., with the 3' or 5' terminus of the
antisense oligonucleotide being positioned within about, for example, 10, 5, 3 or
2 nucleotides of the intron/exon junction). Also preferred are antisense
oligonucleotides which overlap the initiation codon. 
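
The positional preference described above, in which the antisense oligonucleotide either overlies an intron/exon junction or has a terminus within a few nucleotides of it, can be sketched as a coordinate check. The function name relation_to_junction, the coordinate convention, and the returned labels are assumptions made for illustration and are not part of the specification.

```python
def relation_to_junction(oligo_start, oligo_end, junction, window=10):
    """Classify the target region bound by an antisense oligonucleotide
    relative to an intron/exon junction.

    Coordinates are 1-based positions on the precursor mRNA. The bound
    region runs from oligo_start to oligo_end inclusive, and the junction
    is taken to lie between positions `junction` and `junction + 1`.
    `window` is the permitted distance in nucleotides (e.g., 10, 5, 3 or 2).
    """
    if oligo_start > oligo_end:
        raise ValueError("oligo_start must not exceed oligo_end")
    if oligo_start <= junction < oligo_end:
        return "overlies"
    if oligo_end <= junction:
        # Bound region lies entirely on the upstream side of the junction.
        distance = junction - oligo_end
    else:
        # Bound region lies entirely on the downstream side of the junction.
        distance = oligo_start - (junction + 1)
    return "near" if distance <= window else "distant"


if __name__ == "__main__":
    print(relation_to_junction(95, 115, 100))            # spans the junction
    print(relation_to_junction(80, 97, 100, window=5))   # 3 nt upstream
    print(relation_to_junction(120, 140, 100))           # 19 nt downstream
```

Whether a given distance actually suffices to inhibit splicing is an experimental question; the check only encodes the geometric criterion stated in the text.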

When practicing the present invention, the antisense oligonucleotides 
administered may be related in origin to the species to which they are administered.
When treating humans, human antisense may be used if desired. 

Promoters for use in carrying out the present invention are operable 
in bone marrow cells. An operable promoter in bone marrow cells is a promoter 
that is recognized by and functions in bone marrow cells. Promoters for use with 
the present invention must also be operatively associated with the heterologous 
RNA to be expressed in the bone marrow. A promoter is operably linked to a 
heterologous RNA if it controls the transcription of the heterologous RNA, where 
the heterologous RNA comprises a coding sequence. Suitable promoters are well 
known in the art. The Sindbis 26S promoter is preferred when the alphavirus is 
a strain of Sindbis virus. Additional preferred promoters beyond the Sindbis 26S 
promoter include the Girdwood S.A. 26S promoter when the alphavirus is 
Girdwood S.A., the S.A.AR86 26S promoter when the alphavirus is S.A.AR86, 
and any other promoter sequence recognized by alphavirus polymerases. 
Alphavirus promoter sequences containing mutations which alter the activity level 
of the promoter (in relation to the activity level of the wild-type) are also suitable 
in the practice of the present invention. Such mutant promoter sequences are 
described in Raju and Huang, J. Virol. 65, 2501-2510 (1991), the disclosure of
which is incorporated in its entirety by reference. 

The heterologous RNA is introduced into the bone marrow cells by 
contacting the recombinant alphavirus carrying the heterologous RNA segment to
the bone marrow cells. By contacting, it is meant bringing the recombinant 
alphavirus and the bone marrow cells in physical proximity. The contacting step 
can be performed in vitro or in vivo. In vitro contacting can be carried out with 
cultures of immortalized or non-immortalized bone marrow cells. In one particular 
embodiment, bone marrow cells can be removed from a subject, cultured in vitro, 
infected with the vector, and then introduced back into the subject. Contacting is
performed in vivo when the recombinant alphavirus is administered to a subject.
Pharmaceutical formulations of recombinant alphavirus can be administered to a
subject parenterally (e.g., by subcutaneous, intracerebral, intradermal, intramuscular,
intravenous, or intraarticular administration). Alternatively, pharmaceutical
formulations of the present invention may be suitable for administration to the
mucous membranes of a subject (e.g., intranasal administration, by use of a
dropper, swab, or inhaler). Methods of preparing infectious virus particles and 
pharmaceutical formulations thereof are discussed in more detail hereinbelow. 

By "introducing" the heterologous RNA segment into the bone 
marrow cells it is meant infecting the bone marrow cells with recombinant 
alphavirus containing the heterologous RNA, such that the viral vector carrying the 
heterologous RNA enters the bone marrow cells and can be expressed therein. As 
used with respect to the present invention, when the heterologous RNA is 
"expressed," it is meant that the heterologous RNA is transcribed. In particular 
embodiments of the invention in which it is desired to produce a protein or 
peptide, expression further includes the steps of post-transcriptional processing and 
translation of the mRNA transcribed from the heterologous RNA. In contrast, 
where the heterologous RNA encodes an antisense oligonucleotide, expression need 
not include post-transcriptional processing and translation. With respect to 
embodiments in which the heterologous RNA encodes an immunogenic protein or 
a protein being administered for therapeutic purposes, expression may also include
the further step of post-translational processing to produce an immunogenic or 
therapeutically-active protein. 



The present invention also provides infectious RNAs, as described 
hereinabove, and cDNAs encoding the same. Preferably the infectious RNAs and 
cDNAs are derived from the S.A.AR86, Girdwood S.A., TR339, or Ockelbo 
viruses. The cDNA clones can be generated by any of a variety of suitable 
methods known to those skilled in the art. A preferred method is the method set 
forth in United States Patent No. 5,185,440 to Davis et al., the disclosure of which
is incorporated in its entirety by reference, and Gubler et al., Gene 25:263 (1983).

RNA is preferably synthesized from the DNA sequence in vitro 
using purified RNA polymerase in the presence of ribonucleotide triphosphates and 
cap analogs in accordance with conventional techniques. However, the RNA may 
also be synthesized intracellularly after introduction of the cDNA. 

A. Double Promoter Vectors.

In one embodiment of the invention, double promoter vectors are 
used to introduce the heterologous RNA into the target bone marrow cells. A 
double promoter virus vector is a replication and propagation competent virus. 
Double promoter vectors are described in United States Patent No. 5,505,947 to 
Johnston et al., the disclosure of which is incorporated in its entirety by reference. 
Preferred alphaviruses for constructing the double promoter vectors are S.A.AR86, 
Girdwood S.A., TR339 and Ockelbo viruses. More preferably, the double 
promoter vector contains one or more attenuating mutations. Attenuating 
mutations are described in more detail hereinabove. 

The double promoter vector is constructed so as to contain a second 
subgenomic promoter (i.e., 26S promoter) inserted 3' to the virus RNA encoding
the structural proteins. The heterologous RNA is inserted between the second
subgenomic promoter, so as to be operatively associated therewith, and the 3'
UTR of the virus genome. Heterologous RNA sequences of less than 3 kilobases, 
more preferably those less than 2 kilobases, and more preferably still those less 
than 1 kilobase, can be inserted into the double promoter vector. In a preferred 
embodiment of the invention, the double promoter vector is derived from 
Girdwood S.A., and the second subgenomic promoter is a duplicate of the 
Girdwood S.A. subgenomic promoter. In an alternate preferred embodiment, the 
double promoter vector is derived from TR339, and the second subgenomic 
promoter is a duplicate of the TR339 subgenomic promoter. 
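
The size preference stated above for inserts in the double promoter vector can be written as a small lookup. The following is a minimal sketch in Python, assuming that one kilobase is read as 1000 nucleotides and that each tier is a strict "less than" bound; the function name `insert_size_preference` and the tier labels are hypothetical and serve only as an illustration.

```python
def insert_size_preference(length_nt: int) -> str:
    """Classify a heterologous insert by the size tiers given in the text.

    Assumptions (not from the source): 1 kilobase = 1000 nucleotides, and
    each tier is a strict upper bound. The labels are illustrative only.
    """
    if length_nt <= 0:
        raise ValueError("insert length must be a positive number of nucleotides")
    if length_nt < 1000:
        return "most preferred"          # less than 1 kilobase
    if length_nt < 2000:
        return "more preferred"          # less than 2 kilobases
    if length_nt < 3000:
        return "suitable"                # less than 3 kilobases
    return "outside stated range"        # 3 kilobases or more


if __name__ == "__main__":
    for n in (750, 1500, 2999, 3000):
        print(n, insert_size_preference(n))
```

The sketch only restates the numeric tiers; it says nothing about whether a given insert is stably maintained in a particular vector.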

B. Replicon Vectors.

Replicon vectors, which are propagation-defective virus vectors, can
also be used to carry out the present invention. Replicon vectors are described in
more detail in copending United States Application 08/448,630 to Johnston et al.,
the disclosure of which is incorporated in its entirety by reference. Preferred
alphaviruses for constructing the replicon vectors are S.A.AR86, Girdwood S.A., 
TR339, and Ockelbo. 

In general, in the replicon system, a foreign gene to be expressed
is inserted in place of at least one of the viral structural protein genes in a
transcription plasmid containing an otherwise full-length cDNA copy of the
alphavirus genome RNA. RNA transcribed from this plasmid contains an intact
copy of the viral nonstructural genes which are responsible for RNA replication
and transcription. Thus, if the transcribed RNA is transfected into susceptible
cells, it will be replicated and translated to give the nonstructural proteins. These
proteins will transcribe the transfected RNA to give high levels of subgenomic
mRNA, which will then be translated to produce high levels of the foreign protein.
The autonomously replicating RNA (i.e., replicon) can only be packaged into virus
particles if the alphavirus structural protein genes are provided on one or more
"helper" RNAs, which are cotransfected into cells along with the replicon RNA.
The helper RNAs do not contain the viral nonstructural genes for replication, but
these functions are provided in trans by the replicon RNA. Similarly, the
transcriptase functions translated from the replicon RNA transcribe the structural
protein genes on the helper RNA, resulting in the synthesis of viral structural
proteins and packaging of the replicon into virus-like particles. As the packaging
or encapsidation signal for alphavirus RNAs is located within the nonstructural
genes, the absence of these sequences in the helper RNAs precludes their
incorporation into virus particles.

Alphavirus-permissive cells employed in the methods of the present
invention are cells which, upon transfection with the viral RNA transcript, are
capable of producing viral particles. Preferred alphavirus-permissive cells are
TR339-permissive cells, Girdwood S.A.-permissive cells, S.A.AR86-permissive cells, and
Ockelbo-permissive cells. Alphaviruses have a broad host range. Examples of suitable 
host cells include, but are not limited to Vero cells, baby hamster kidney (BHK) cells, and 
chicken embryo fibroblast cells. 

The phrase "structural protein" as used herein refers to the encoded
proteins which are required for encapsidation (e.g., packaging) of the RNA replicon, and
include the capsid protein, E1 glycoprotein, and E2 glycoprotein. As described
hereinabove, the structural proteins of the alphavirus are distributed among one or more
helper RNAs (i.e., a first helper RNA and a second helper RNA). In addition, one or
more structural proteins may be located on the same RNA molecule as the replicon RNA,
provided that at least one structural protein is deleted from the replicon RNA such that the
resulting alphavirus particle is propagation defective. As used herein, the terms "deleted"
or "deletion" mean either total deletion of the specified segment or the deletion of a
sufficient portion of the specified segment to render the segment inoperative or
nonfunctional, in accordance with standard usage. See, e.g., U.S. Patent No. 4,650,764
to Temin et al. The term "propagation defective" as used herein, means that the replicon
RNA cannot be encapsidated in the host cell in the absence of the helper RNA. The
resulting alphavirus replicon particles are propagation defective inasmuch as the replicon
RNA in these particles does not include all of the alphavirus structural proteins required
for encapsidation, at least one of the required structural proteins being deleted therefrom,
such that the replicon RNA initiates only an abortive infection; no new viral particles are
produced, and there is no spread of the infection to other cells.

The helper cell for expressing the infectious, propagation defective alphavirus 
particle comprises a set of RNAs, as described above. The set of RNAs principally
includes a first helper RNA and a second helper RNA. The first helper RNA includes
RNA encoding at least one alphavirus structural protein but does not encode all alphavirus 
structural proteins. In other words, the first helper RNA does not encode at least one 
alphavirus structural protein; the at least one non-coded alphavirus structural protein being 
deleted from the first helper RNA. 
In one embodiment, the first helper RNA includes RNA encoding the alphavirus
E1 glycoprotein, with the alphavirus capsid protein and the alphavirus E2
glycoprotein being deleted from the first helper RNA. In another embodiment, the
first helper RNA includes RNA encoding the alphavirus E2 glycoprotein, with the
alphavirus capsid protein and the alphavirus E1 glycoprotein being deleted from
the first helper RNA. In a third, preferred embodiment, the first helper RNA
includes RNA encoding the alphavirus E1 glycoprotein and the alphavirus E2
glycoprotein, with the alphavirus capsid protein being deleted from the first helper
RNA.

The second helper RNA includes RNA encoding at least one
alphavirus structural protein which is different from the at least one structural
protein encoded by the first helper RNA. Thus, the second helper RNA encodes
at least one alphavirus structural protein which is not encoded by the first helper
RNA. The second helper RNA does not encode the at least one alphavirus
structural protein which is encoded by the first helper RNA, thus the first and
second helper RNAs do not encode duplicate structural proteins. In the
embodiment wherein the first helper RNA includes RNA encoding only the
alphavirus E1 glycoprotein, the second helper RNA may include RNA encoding
one or both of the alphavirus capsid protein and the alphavirus E2 glycoprotein
which are deleted from the first helper RNA. In the embodiment wherein the first
helper RNA includes RNA encoding only the alphavirus E2 glycoprotein, the
second helper RNA may include RNA encoding one or both of the alphavirus
capsid protein and the alphavirus E1 glycoprotein which are deleted from the first
helper RNA. In the embodiment wherein the first helper RNA includes RNA
encoding both the alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein,
the second helper RNA may include RNA encoding the alphavirus capsid protein
which is deleted from the first helper RNA.
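
The relationship between the two helper RNAs described above amounts to a complement rule: the second helper RNA may encode any non-empty subset of the structural proteins deleted from the first. The following Python sketch enumerates those options. It assumes the three structural proteins named in the text (capsid, E1, E2) are the full set; the name `second_helper_options` and the string labels are hypothetical.

```python
from __future__ import annotations

from itertools import combinations

# The three structural proteins named in the text; labels are illustrative.
STRUCTURAL_PROTEINS = frozenset({"capsid", "E1", "E2"})


def second_helper_options(first_helper: set[str]) -> list[frozenset[str]]:
    """List the protein sets a second helper RNA may encode.

    The first helper must encode at least one, but not all, structural
    proteins; the second helper then encodes one or more of the proteins
    deleted from the first, so that the two never duplicate a protein.
    """
    first = frozenset(first_helper)
    unknown = first - STRUCTURAL_PROTEINS
    if unknown:
        raise ValueError(f"not a structural protein: {sorted(unknown)}")
    if not first or first == STRUCTURAL_PROTEINS:
        raise ValueError("first helper must encode at least one but not all proteins")
    missing = sorted(STRUCTURAL_PROTEINS - first)
    return [
        frozenset(combo)
        for size in range(1, len(missing) + 1)
        for combo in combinations(missing, size)
    ]


if __name__ == "__main__":
    # First helper encodes only E1: second may carry capsid, E2, or both.
    print(second_helper_options({"E1"}))
    # First helper encodes E1 and E2: second carries capsid.
    print(second_helper_options({"E1", "E2"}))
```

The enumeration covers the combinations permitted by the rule; which of them yields assembled particles depends on what else is supplied, for example on the replicon RNA.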

In one embodiment, the packaging segment (RNA comprising the 
encapsidation or packaging signal) is deleted from at least the first helper RNA.
In a preferred embodiment, the packaging segment is deleted from both the first
helper RNA and the second helper RNA. 

In the preferred embodiment wherein the packaging segment is 
deleted from both the first helper RNA and the second helper RNA, the helper cell 
is co-transfected with a replicon RNA in addition to the first helper RNA and the 
second helper RNA. The replicon RNA encodes the packaging segment and an 
inserted heterologous RNA. The inserted heterologous RNA may be RNA 
encoding a protein or a peptide. In a preferred embodiment, the replicon RNA, 
the first helper RNA and the second helper RNA are provided on separate 
molecules such that a first molecule, i.e., the replicon RNA, includes RNA 
encoding the packaging segment and the inserted heterologous RNA, a second 
molecule, i.e., the first helper RNA, includes RNA encoding at least one but not 
all of the required alphavirus structural proteins, and a third molecule, i.e., the 
second helper RNA, includes RNA encoding at least one but not all of the required 
alphavirus structural proteins. For example, in one preferred embodiment of the 
present invention, the helper cell includes a set of RNAs which include (a) a 
replicon RNA including RNA encoding an alphavirus packaging sequence and an 
inserted heterologous RNA, (b) a first helper RNA including RNA encoding the 
alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein, and (c) a second
helper RNA including RNA encoding the alphavirus capsid protein so that the
alphavirus E1 glycoprotein, the alphavirus E2 glycoprotein and the capsid protein
assemble together into alphavirus particles in the host cell. 
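
The three-molecule arrangement above can be checked mechanically: the set must supply every structural protein exactly once, and only an RNA carrying the packaging segment is expected to be encapsidated. The sketch below is an illustrative Python model under those assumptions; `RnaMolecule`, `check_rna_set`, and the field names are hypothetical and are not terms used in the source.

```python
from dataclasses import dataclass

REQUIRED = frozenset({"capsid", "E1", "E2"})  # structural proteins named in the text


@dataclass(frozen=True)
class RnaMolecule:
    """Illustrative record for one RNA in the co-transfected set."""
    name: str
    structural_proteins: frozenset = frozenset()
    has_packaging_segment: bool = False
    has_heterologous_insert: bool = False


def check_rna_set(rnas):
    """Return (names of RNAs expected to be packaged, list of problems)."""
    problems = []
    encoded = [p for rna in rnas for p in rna.structural_proteins]
    missing = REQUIRED - set(encoded)
    if missing:
        problems.append("missing: " + ", ".join(sorted(missing)))
    duplicated = sorted({p for p in encoded if encoded.count(p) > 1})
    if duplicated:
        problems.append("duplicated: " + ", ".join(duplicated))
    packaged = [rna.name for rna in rnas if rna.has_packaging_segment]
    return packaged, problems


# The preferred set (a)-(c) from the paragraph above.
preferred_set = [
    RnaMolecule("replicon", has_packaging_segment=True, has_heterologous_insert=True),
    RnaMolecule("first helper", structural_proteins=frozenset({"E1", "E2"})),
    RnaMolecule("second helper", structural_proteins=frozenset({"capsid"})),
]

if __name__ == "__main__":
    print(check_rna_set(preferred_set))
```

In this model the helper RNAs are never listed as packaged because their packaging segment is deleted, which mirrors the preferred embodiment.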

In an alternate embodiment, the replicon RNA and the first helper
RNA are on separate molecules, and the replicon RNA and RNA encoding a
structural gene not encoded by the first helper RNA are on another single molecule
together, such that a first molecule, i.e., the first helper RNA, includes RNA
encoding at least one but not all of the required alphavirus structural proteins, and
a second molecule, i.e., the replicon RNA, includes RNA encoding the packaging
segment, the inserted heterologous RNA, and the remaining structural proteins not
encoded by the first helper RNA. For example, in one preferred embodiment of
the present invention, the helper cell includes a set of RNAs including (a) a
replicon RNA including RNA encoding an alphavirus packaging sequence, an
inserted heterologous RNA, and an alphavirus capsid protein, and (b) a first helper
RNA including RNA encoding the alphavirus E1 glycoprotein and the alphavirus
E2 glycoprotein so that the alphavirus E1 glycoprotein, the alphavirus E2
glycoprotein and the capsid protein assemble together into alphavirus particles in
the host cell, with the replicon RNA packaged therein.

In one preferred embodiment of the present invention, the RNA
encoding the alphavirus structural proteins, i.e., the capsid, E1 glycoprotein and
E2 glycoprotein, contains at least one attenuating mutation, as described
hereinabove. Thus, according to this embodiment, at least one of the first helper
RNA and the second helper RNA includes at least one attenuating mutation. In
a more preferred embodiment, at least one of the first helper RNA and the second
helper RNA includes at least two, or multiple, attenuating mutations. The multiple
attenuating mutations may be positioned in either the first helper RNA or in the
second helper RNA, or they may be distributed randomly with one or more
attenuating mutations being positioned in the first helper RNA and one or more
attenuating mutations positioned in the second helper RNA. Alternatively, when
the replicon RNA and the RNA encoding the structural proteins not encoded by
the first helper RNA are located on the same molecule, an attenuating mutation
may be positioned in the RNA which codes for the structural protein not encoded
by the first helper RNA. The attenuating mutations may also be located within the
RNA encoding non-structural proteins (e.g., the replicon RNA).

Preferably, the first helper RNA and the second helper RNA also
include a promoter. It is also preferred that the replicon RNA also includes a
promoter. Suitable promoters for inclusion in the first helper RNA, second helper
RNA and replicon RNA are well known in the art. One preferred promoter is the
Girdwood S.A. 26S promoter for use when the alphavirus is Girdwood S.A.
Another preferred promoter is the TR339 26S promoter for use when the
alphavirus is TR339. Additional promoters beyond the Girdwood S.A. and TR339
promoters include the VEE 26S promoter, the Sindbis 26S promoter, the Semliki
Forest 26S promoter, and any other promoter sequence recognized by alphavirus
polymerases. Alphavirus promoter sequences containing mutations which alter the
activity level of the promoter (in relation to the activity level of the wild-type) are
also suitable in the practice of the present invention. Such mutant promoter
sequences are described in Raju and Huang, J. Virol. 65, 2501-2510 (1991), the
disclosure of which is incorporated herein in its entirety. In the system wherein 
the first helper RNA, the second helper RNA, and the replicon RNA are all on 
separate molecules, the promoters, if the same promoter is used for all three 
RNAs, provide a homologous sequence between the three molecules. It is 
preferred that the selected promoter is operative with the non-structural proteins 
encoded by the replicon RNA molecule. 

In cases where vaccination with two immunogens provides improved 
protection against disease as compared to vaccination with only a single 
immunogen, a double-promoter replicon would ensure that both immunogens are 
produced in the same cell. Such a replicon would be the same as the one 
described above, except that it would contain two copies of the 26S RNA 
promoter, each followed by a different multiple cloning site, to allow for the 
insertion and expression of two different heterologous proteins. Another useful 
strategy is to insert the IRES sequence from the picornavirus, EMC virus, between 
the two heterologous genes downstream from the single 26S promoter of the 
replicon described above, thus leading to expression of two immunogens from the 
single replicon transcript in the same cell. 

C. Uses of the Present Invention.

The alphavirus vectors, RNAs, cDNAs, helper cells, infectious virus 
particles, and methods of the present invention find use in in vitro expression 
systems, wherein the inserted heterologous RNA encodes a protein or peptide 
which is desirably produced in vitro. The RNAs, cDNAs, helper cells, infectious 
virus particles, methods, and pharmaceutical formulations of the present invention 
are additionally useful in a method of administering a protein or peptide to a
subject in need of the protein or peptide, as a method of treatment or otherwise. 
In this embodiment of the invention, the heterologous RNA encodes the desired 
protein or peptide, and pharmaceutical formulations of the present invention are 
administered to a subject in need of the desired protein or peptide. In this manner, 
the protein or peptide may thus be produced in vivo in the subject. The subject
may be in need of the protein or peptide because the subject has a deficiency 
thereof, or because the production of the protein or peptide in the subject may 
impart some therapeutic effect, as a method of treatment or otherwise. 

Alternately, the claimed methods provide a vaccination strategy,
wherein the heterologous RNA encodes an immunogenic protein or peptide.

The methods and products of the invention are also useful as 
antigens and for evoking the production of antibodies in animals such as horses
and rabbits, from which the antibodies may be collected and then used in
diagnostic assays in accordance with known techniques.

A further aspect of the present invention is a method of introducing
and expressing antisense oligonucleotides in bone marrow cell cultures to regulate
gene expression. Alternately, the claimed method finds use in introducing and 
expressing a protein or peptide in bone marrow cell cultures. 

II. Girdwood S.A. and TR339 Clones.

Disclosed hereinbelow are genomic RNA sequences encoding live 
Girdwood S.A. virus, live S.A.AR86 virus, and live Sindbis strain TR339 virus,
cDNAs derived therefrom, infectious RNA transcripts encoded by the cDNAs, 
infectious viral particles containing the infectious RNA transcripts, and 
pharmaceutical formulations derived therefrom. 

The cDNA sequence of Girdwood S.A. is given herein as SEQ ID
NO:4. Alternatively, the cDNA may have a sequence which differs from the
cDNA of SEQ ID NO:4, but which has the same protein sequence as the cDNA
given herein as SEQ ID NO:4. Thus, the cDNA may include one or more silent
mutations.

The phrase "silent mutation" as used herein refers to mutations in
the cDNA coding sequence which do not produce mutations in the corresponding
protein sequence translated therefrom.
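
The definition of a silent mutation can be made concrete by translating two coding sequences and comparing the results. The following is a minimal Python sketch that assumes the standard genetic code and in-frame DNA coding sequences; the function names and the short test sequences are hypothetical toy examples and are not taken from SEQ ID NO:4 or SEQ ID NO:8.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering ("*" = stop).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def differs_only_by_silent_mutations(reference: str, variant: str) -> bool:
    """True when the sequences differ but encode the same protein."""
    return (
        reference.upper() != variant.upper()
        and translate(reference) == translate(variant)
    )


if __name__ == "__main__":
    # Toy example: GCT and GCC both encode alanine, so the change is silent.
    print(differs_only_by_silent_mutations("ATGGCTTAA", "ATGGCCTAA"))
    # GCT (alanine) to GAT (aspartate) changes the protein.
    print(differs_only_by_silent_mutations("ATGGCTTAA", "ATGGATTAA"))
```

A check of this kind compares only the encoded protein; it does not address effects a nucleotide change may have on RNA structure or on cis-acting sequences.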

Likewise, the cDNA sequence of TR339 is given herein as SEQ ID 
NO:8. Alternatively, the cDNA may have a sequence which differs from the 
cDNA of SEQ ID NO:8, but which has the same protein sequence as the cDNA 
given herein as SEQ ID NO:8. Thus, the cDNA may include one or more silent
mutations.

The cDNAs encoding infectious Girdwood S.A. and TR339 virus
RNA transcripts of the present invention include those homologous to, and having
essentially the same biological properties as, the cDNA sequences disclosed herein
as SEQ ID NO:4 and SEQ ID NO:8, respectively. Thus, cDNAs that hybridize
to cDNAs encoding infectious Girdwood S.A. or TR339 virus RNA transcripts
disclosed herein are also an aspect of this invention. Conditions which will permit
other cDNAs encoding infectious Girdwood S.A. or TR339 virus transcripts to
hybridize to the cDNAs disclosed herein can be determined in accordance with
known techniques. For example, hybridization of such sequences may be carried
out under conditions of reduced stringency, medium stringency, or even high
stringency conditions (e.g., conditions represented by a wash stringency of 35-40%
formamide with 5X Denhardt's solution, 0.5% SDS and 1X SSPE at 37°C;
conditions represented by a wash stringency of 40-45% formamide with 5X
Denhardt's solution, 0.5% SDS, and 1X SSPE at 42°C; and conditions represented
by a wash stringency of 50% formamide with 5X Denhardt's solution, 0.5% SDS
and 1X SSPE at 42°C, respectively, to cDNA encoding infectious Girdwood S.A.
or TR339 virus RNA transcripts disclosed herein in a standard hybridization assay.
See J. SAMBROOK ET AL., MOLECULAR CLONING: A LABORATORY
MANUAL (2d ed. 1989)). In general, cDNA sequences encoding infectious
Girdwood S.A. or TR339 virus RNA transcripts that hybridize to the cDNAs
disclosed herein will be at least 30% homologous, 50% homologous, 75%
homologous, and even 95% homologous or more with the cDNA sequences
encoding infectious Girdwood S.A. or TR339 virus RNA transcripts disclosed
herein.

Promoter sequences and Girdwood S.A. virus or Sindbis virus strain 
TR339 cDNA clones are operatively associated in the present invention such that 
the promoter causes the cDNA clone to be transcribed in the presence of an RNA
polymerase which binds to the promoter. The promoter is positioned on the 5' end
(with respect to the virion RNA sequence) of the cDNA clone. An excessive
number of nucleotides between the promoter sequence and the cDNA clone will 
result in the inoperability of the construct. Hence, the number of nucleotides 
between the promoter sequence and the cDNA clone is preferably not more than 
eight, more preferably not more than five, still more preferably not more than 
three, and most preferably not more than one. 
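
The spacing preference between the promoter and the cDNA clone is a nested set of upper bounds, which the following Python sketch restates. It assumes the spacer is counted in whole nucleotides and that each bound is inclusive ("not more than"); the function name `spacer_preference` and the returned labels are hypothetical.

```python
def spacer_preference(spacer_nt: int) -> str:
    """Classify the promoter-to-cDNA spacer length by the tiers in the text.

    Assumption: bounds are inclusive, matching "not more than" in the text.
    The labels are illustrative only.
    """
    if spacer_nt < 0:
        raise ValueError("spacer length cannot be negative")
    if spacer_nt <= 1:
        return "most preferred"
    if spacer_nt <= 3:
        return "still more preferred"
    if spacer_nt <= 5:
        return "more preferred"
    if spacer_nt <= 8:
        return "preferred"
    return "outside preferred range"


if __name__ == "__main__":
    for n in (0, 1, 3, 5, 8, 9):
        print(n, spacer_preference(n))
```

The text does not give a length at which a construct becomes inoperable, so the final label marks only that the spacer exceeds the stated preferences.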

Examples of promoters which are useful in the cDNA sequences of 
the present invention include, but are not limited to, T3 promoters, T7 promoters,
cytomegalovirus (CMV) promoters, and SP6 promoters. The DNA sequence of 
the present invention may reside in any suitable transcription vector. The DNA 
sequence preferably has a complementary DNA sequence bound thereto so that the 
double-stranded sequence will serve as an active template for RNA polymerase. 
The transcription vector preferably comprises a plasmid. When the DNA sequence 
comprises a plasmid, it is preferred that a unique restriction site be provided 3' 
(with respect to the virion RNA sequence) to the cDNA clone. This provides a 
means for linearizing the DNA sequence to allow the transcription of genome- 
length RNA in vitro. 

The cDNA clones can be generated by any of a variety of suitable 
methods known to those skilled in the art. A preferred method is the method set 
forth in United States Patent No. 5,185,440 to Davis et al., the disclosure of which
is incorporated in its entirety by reference, and Gubler et al., Gene 25:263 (1983).

RNA is preferably synthesized from the DNA sequence in vitro 
using purified RNA polymerase in the presence of ribonucleotide triphosphates and 
cap analogs in accordance with conventional techniques. However, the RNA may 
also be synthesized intracellularly after introduction of the cDNA.

The Girdwood S.A. and TR339 cDNA clones and the infectious
RNAs and infectious virus particles produced therefrom of the present invention 
are useful for the preparation of pharmaceutical formulations, such as vaccines. 
In addition, the cDNA clones, infectious RNAs, and infectious viral particles of 
the present invention are useful for administration to animals for the purpose of
producing antibodies to the Girdwood S.A. virus or the Sindbis virus strain 
TR339, which antibodies may be collected and used in known diagnostic 
techniques for the detection of Girdwood S.A. virus or Sindbis virus strain TR339. 
Antibodies can also be generated to the viral proteins expressed from the cDNAs 
disclosed herein. As another aspect of the present invention, the claimed cDNA 
clones are useful as nucleotide probes to detect the presence of Girdwood S.A. or 
TR339 genomic RNA or transcripts. 

III. Infectious Virus Particles and Pharmaceutical Formulations.

The infectious virus particles of the present invention include those 
containing double promoter vectors and those containing replicon vectors as 
described hereinabove. Alternately, the infectious virus particles contain infectious 
RNAs encoding the Girdwood S.A. or TR339 genome. When the infectious RNA 
comprises the Girdwood S.A. genome, preferably the RNA has the sequence 
encoded by the cDNA given as SEQ ID NO:4. When the infectious RNA 
comprises the TR339 genome, preferably the RNA has the sequence encoded by 
the cDNA given as SEQ ID NO:8. 

The infectious alphavirus particles of the present invention may be
prepared according to the methods disclosed herein in combination with techniques
known to those skilled in the art. These methods include transfecting an
alphavirus-permissive cell with a replicon RNA including the alphavirus packaging 
segment and an inserted heterologous RNA, a first helper RNA including RNA 
encoding at least one alphavirus structural protein, and a second helper RNA 
including RNA encoding at least one alphavirus structural protein which is 
different from that encoded by the first helper RNA. Alternately, and preferably, 
at least one of the helper RNAs is produced from a cDNA encoding the helper 
RNA and operably associated with an appropriate promoter, the cDNA being 
stably transfected and integrated into the cells. More preferably, all of the helper 
RNAs will be "launched" from stably transfected cDNAs. The step of transfecting 
the alphavirus-permissive cell can be carried out according to any suitable means 
known to those skilled in the art, as described above with respect to propagation- 
competent viruses. 

Uptake of propagation-competent RNA into the cells in vitro can be 
carried out according to any suitable means known to those skilled in the art. 
Uptake of RNA into the cells can be achieved, for example, by treating the cells 
with DEAE-dextran, treating the RNA with LIPOFECTIN® before addition to the
cells, or by electroporation, with electroporation being the currently preferred 
means. These techniques are well known in the art. See e.g., United States 
Patent No. 5,185,440 to Davis et al., and PCT Publication No. WO 92/10578 to 
Bioption AB, the disclosures of which are incorporated herein by reference in their 
entirety. Uptake of propagation-competent RNA into the cell in vivo can be 
carried out by administering the infectious RNA to a subject as described in
Section I above.

The infectious RNAs may also contain a heterologous RNA 
segment, where the heterologous RNA segment contains a heterologous RNA and 
a promoter operably associated therewith. It is preferred that the infectious RNA 
introduces and expresses the heterologous RNA in bone marrow cells as described 
in Section I above. According to this embodiment, it is preferable that the 
promoter operatively associated with the heterologous RNA is operable in bone 
marrow cells. The heterologous RNA may encode any protein or peptide,
preferably an immunogenic protein or peptide, a therapeutic protein or peptide, a
hormone, a growth factor, an interleukin, a cytokine, a chemokine, an enzyme,
a ribozyme, or an antisense oligonucleotide as described in more detail in Section
I above. 

The step of facilitating the production of the infectious viral particles 
in the cells may be carried out using conventional techniques. See e.g., United 
States Patent No. 5,185,440 to Davis et al., PCT Publication No. WO 92/10578
to Bioption AB, and United States Patent No. 4,650,764 to Temin et al. (although
Temin et al. relates to retroviruses rather than alphaviruses). The infectious viral
particles may be produced by standard cell culture growth techniques. 

The step of collecting the infectious virus particles may also be 
carried out using conventional techniques. For example, the infectious particles 
may be collected by cell lysis, or collection of the supernatant of the cell culture, 
as is known in the art. See e.g., United States Patent No. 5,185,440 to Davis et 
al., PCT Publication No. WO 92/10578 to Bioption AB, and United States Patent 
No. 4,650,764 to Temin et al. Other suitable techniques will be known to those 
skilled in the art. Optionally, the collected infectious virus particles may be 
purified if desired. Suitable purification techniques are well known to those skilled 
in the art.

Pharmaceutical formulations, such as vaccines, of the present 
invention comprise an immunogenic amount of the infectious virus particles in
combination with a pharmaceutically acceptable carrier. An "immunogenic 
amount" is an amount of the infectious virus particles which is sufficient to evoke 
an immune response in the subject to which the pharmaceutical formulation is 
administered. An amount of from about 10^1 to about 10^7 particles, and preferably
about 10* to 10 s particles per dose is believed suitable, depending upon the age and 
species of the subject being treated, and the immunogen against which the immune 
response is desired. 

Pharmaceutical formulations of the present invention for therapeutic
use comprise a therapeutic amount of the infectious virus particles in combination 
with a pharmaceutically acceptable carrier. A "therapeutic amount" is an amount 
of the infectious virus particles which is sufficient to produce a therapeutic effect 
(e.g., triggering an immune response or supplying a protein to a subject in need
thereof) in the subject to which the pharmaceutical formulation is administered. 
The therapeutic amount will depend upon the age and species of the subject being 
treated, and the therapeutic protein or peptide being administered. Typical dosages 
are an amount from about 10^1 to about 10^5 infectious units.

Exemplary pharmaceutically acceptable carriers include, but are not
limited to, sterile pyrogen-free water and sterile pyrogen-free physiological saline
solution. Subjects which may be administered immunogenic amounts of the
infectious virus particles of the present invention include but are not limited to
human and animal (e.g., pig, cattle, dog, horse, donkey, mouse, hamster,
monkey) subjects.

Pharmaceutical formulations of the present invention include those 
suitable for parenteral (e.g., subcutaneous, intracerebral, intradermal, 
intramuscular, intravenous and intraarticular) administration. Alternatively, 
pharmaceutical formulations of the present invention may be suitable for 
administration to the mucous membranes of a subject (e.g., intranasal administration
by use of a dropper, swab, or inhaler). The formulations may be conveniently 
prepared in unit dosage form and may be prepared by any of the methods well 
known in the art.

The following examples are provided to illustrate the present
invention, and should not be construed as limiting thereof. In these examples,
PBS means phosphate buffered saline, EDTA means ethylene diamine tetraacetate,
ml means milliliter, µl means microliter, mM means millimolar, µM means
micromolar, u means unit, PFU means plaque forming units, g means gram, mg
means milligram, µg means microgram, cpm means counts per minute, ic means
intracerebral or intracerebrally, ip means intraperitoneal or intraperitoneally, iv
means intravenous or intravenously, and sc means subcutaneous or subcutaneously.

Amino acid sequences disclosed herein are presented in the amino 
to carboxyl direction, from left to right. The amino and carboxyl groups are not 
presented in the sequence. Nucleotide sequences are presented herein by single
strand only in the 5' to 3' direction, from left to right. Nucleotides and amino
acids are represented herein in the manner recommended by the IUPAC-IUB
Biochemical Nomenclature Commission, or (for amino acids) by either one letter
or three letter code, in accordance with 37 CFR § 1.822 and established usage.
Where one letter amino acid code is used, the same sequence is also presented
elsewhere in three letter code. 

EXAMPLE 1
Cells and Virus Stocks
S.A.AR86 was isolated in 1954 from a pool of Culex sp. mosquitoes
collected near Johannesburg, South Africa. Weinbren et al., S. Afr. Med. J. 30,
631-36 (1956). Ockelbo82 was isolated from Culiseta sp. mosquitoes collected in
Edsbyn, Sweden in 1982 and was associated serologically with human disease.
Niklasson et al., Am. J. Trop. Med. Hyg. 33, 1212-17 (1984). Girdwood S.A.
was isolated from a human patient in the Johannesburg area of South Africa in
1963. Malherbe et al., S. Afr. Med. J. 37, 547-52 (1963). Molecularly cloned
virus TR339 represents the deduced consensus sequence of Sindbis AR339.
McKnight et al., J. Virol. 70, 1981-89 (1996); William Klimstra, personal
communication. TRSB is a laboratory strain of Sindbis isolate AR339 derived
from a cDNA clone pTRSB and differing from the AR339 consensus sequence at
three codons. McKnight et al., J. Virol. 70, 1981-89 (1996). pTR5000 is a full-
length cDNA clone of Sindbis AR339 following the SP6 phage promoter and
containing mostly Sindbis AR339 sequences.

Stocks of all molecularly cloned viruses were prepared by
electroporating genome length in vitro transcripts of their respective cDNA clones
into BHK-21 cells. Heidner et al., J. Virol. 68, 2683-92 (1994). Girdwood S.A.
(Malherbe et al., S. Afr. Med. J. 37, 547-52 (1963)) and Ockelbo82 (Espmark and
Niklasson, Am. J. Trop. Med. Hyg. 33, 1203-11 (1984); Niklasson et al., Am. J.
Trop. Med. Hyg. 33, 1212-17 (1984)) were passed one to three times in BHK-21
cells in order to produce amplified stocks of virus. All virus stocks were
stored at -70°C until needed. The titers of the virus stocks were determined on
BHK-21 cells from aliquots of frozen virus.

EXAMPLE 2 

Cloning the S.A.AR86 and Girdwood S.A. Genomic Sequences
The sequences of S.A.AR86 (Figure 1, SEQ ID NO:1) and
Girdwood S.A. (Figure 3, SEQ ID NO:4) were determined from uncloned reverse
transcriptase-polymerase chain reaction (RT-PCR) fragments amplified from virion
RNA. Heidner et al., J. Virol. 68, 2683-92 (1994). The sequence of the 5' 40
nucleotides was determined by directly sequencing the genomic RNA. Sanger et
al., Proc. Natl. Acad. Sci. USA 74, 5463-67 (1977); Zimmern and Kaesberg,
Proc. Natl. Acad. Sci. USA 75, 4257-61 (1978); Ahlquist et al., Cell 23, 183-89
(1981).

The S.A.AR86 genome was 11,663 nucleotides in length, excluding
the 5' CAP and 3' poly(A) tail, 40 nucleotides shorter than the alphavirus prototype
Sindbis strain AR339. Strauss et al., Virology 133, 92-110 (1984). Compared
with the consensus sequence of Sindbis virus AR339 (McKnight et al., J. Virol.
70, 1981-89 (1996)), S.A.AR86 contained two separate 6-nucleotide insertions, and
one 3-nucleotide insertion in the 3' half of the nsP3 gene, a region not well
conserved among alphaviruses. The two 6-nucleotide insertions were found
immediately 3' of nucleotides 5403 and 5450, and the 3-nucleotide insertion was
immediately 3' of nucleotide 5546 compared with the AR339 genome. In addition,
S.A.AR86 contained a 54-nucleotide deletion in nsP3 which spanned nucleotides
5256 to 5311 of AR339. As a result of these deletions and insertions, S.A.AR86
nsP3 was 13 amino acids smaller than AR339, containing an 18-amino acid
deletion and a total of 5 amino acids inserted. The 3' untranslated region of
S.A.AR86 contained, with respect to AR339, two 1-nucleotide deletions at
nucleotides 11,513 and 11,602, and one 1-nucleotide insertion following nucleotide
11,664. The total numbers of nucleotides and predicted amino acids comprising
the remaining genes of S.A.AR86 were identical to those of AR339.

A notable feature of the deduced amino acid sequence of S.A.AR86 
(Figure 2, SEQ ID NO:2 and SEQ ID NO:3) was the cysteine codon in place of 
an opal termination codon between nsP3 and nsP4. S.A.AR86 is the only 
alphavirus of the Sindbis group, and one of just three alphavirus isolates sequenced 
to date, which do not contain an opal termination codon between nsP3 and nsP4. 
Takkinen, K., Nucleic Acids Res. 14, 5667-5682 (1986); Strauss et al., Virology 
164, 265-74 (1988). 

The genome of Girdwood S.A. was 11,717 nucleotides long 
excluding the 5' CAP and 3' poly(A) tail. The nucleotide sequence (SEQ ID 
NO:4) of the Girdwood S.A. genome and the putative amino acid sequence (SEQ 
ID NO:5 and SEQ ID NO:6) of the Girdwood S.A. gene products are shown in 
Figure 3 and Figure 4, respectively. The asterisk at position 1902 in SEQ ID 
NO:5 indicates the position of the opal termination codon in the coding region of 
the nonstructural polyprotein. The extra nucleotides relative to AR339 were in the 
nonconserved half of nsP3, which contained insertions totalling 15 nucleotides, and 
in the 3' untranslated region, which contained two 1-nucleotide deletions and a
1-nucleotide insertion with respect to the consensus Sindbis AR339 genome. The
insertions found in the nsP3 gene of Girdwood S.A. were identical in position and
content to those found in S.A.AR86, although Girdwood S.A. did not have the
large nsP3 deletion characteristic of S.A.AR86. The remaining portions of the 
genome contained the same number of nucleotides and predicted amino acids as 
Sindbis AR339. 
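
The length bookkeeping described in this example can be checked with simple
arithmetic. The following Python sketch is an illustration only and is not part
of the sequencing work. The AR339 length of 11,703 nucleotides is an assumption
derived from the statement that the 11,663-nucleotide S.A.AR86 genome is 40
nucleotides shorter than AR339, and all variable names are hypothetical.

```python
# Assumed AR339 length: S.A.AR86 (11,663 nt) is stated to be 40 nt shorter.
AR339_LENGTH = 11_663 + 40

# Signed indel sizes relative to AR339 (insertions positive, deletions negative).
SAAR86_NSP3_INDELS = [+6, +6, +3, -54]   # nsP3 insertions and the large deletion
SAAR86_UTR_INDELS = [-1, -1, +1]         # 3' untranslated region
GIRDWOOD_NSP3_INDELS = [+15]             # nsP3 insertions totalling 15 nt
GIRDWOOD_UTR_INDELS = [-1, -1, +1]       # 3' untranslated region


def genome_length(reference_length, *indel_groups):
    """Return the reference length adjusted by every signed indel."""
    return reference_length + sum(sum(group) for group in indel_groups)


saar86_length = genome_length(AR339_LENGTH, SAAR86_NSP3_INDELS, SAAR86_UTR_INDELS)
girdwood_length = genome_length(AR339_LENGTH, GIRDWOOD_NSP3_INDELS, GIRDWOOD_UTR_INDELS)

# Net nsP3 change in codons: three nucleotides per amino acid.
saar86_nsp3_aa_change = sum(SAAR86_NSP3_INDELS) // 3

print(saar86_length, girdwood_length, saar86_nsp3_aa_change)
```

Under these assumptions the sketch reproduces the 11,663 and 11,717 nucleotide
genome lengths and the 13 amino acid reduction in S.A.AR86 nsP3 stated above.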

Overall, Girdwood S.A. was 94.5% identical to the consensus 
Sindbis AR339 sequence, differing at 655 nucleotides not including the insertions 
and deletions. These nucleotide differences resulted in 88 predicted amino acid
changes or a difference of 2.3%. A plurality of amino acid differences were
concentrated in the nsP3 gene, which contained 32 of the amino acid changes, 25 
of which were in the nonconserved 3' half. 

The Girdwood S.A. nucleotides at positions 1, 3, and 11,717 could 
not be resolved. Because the primer used during the RT-PCR amplification of the 
3' end of the genome assumed a cytosine in the 3' terminal position, the identity 
of this nucleotide could not be determined with certainty. However, in all 
alphaviruses sequenced to date there is a cytosine in this position. This, combined 
with the fact that no difficulty was encountered in obtaining RT-PCR product for 
this region with an oligo(dT) primer ending with a 3'G, suggested that Girdwood 
S.A. also contains a cytosine at this position. The ambiguity at nucleotide 
positions 1 and 3 resulted from strong stops encountered during the RNA 
sequencing. 

EXAMPLE 3 
Comparison of S.A.AR86 and Girdwood S.A.
Sequences With Other Sindbis-Related Virus Sequences

Table 1 examines the relationship of S.A.AR86 and Girdwood S.A. 
to each other and to other Sindbis-related viruses. This was accomplished by 
aligning the nucleotide and deduced amino acid sequences of Ockelbo82, AR339
and Girdwood S.A. to those of S.A.AR86 and then calculating the percentage
identity for each gene using the programs contained within the Wisconsin GCG
package (Genetics Computer Group, 575 Science Drive, Madison WI 53711), as
described in more detail in McKnight et al., J. Virol. 70, 1981-89 (1996).
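
The percentage identity calculation described above can be illustrated with a
minimal Python sketch. It is not the Wisconsin GCG implementation. It assumes
two sequences that have already been aligned to equal length, with "-" marking
gaps, and it leaves gapped columns out of the count. The function name and the
toy sequences are hypothetical and are not viral data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # insertions and deletions are not scored
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment: 9 ungapped columns, 8 of them identical.
toy_identity = percent_identity("AUGGC-AUCG", "AUGACGAUCG")
print(f"{toy_identity:.1f}% identical")
```

Applied gene by gene to real alignments, a calculation of this kind yields the
sort of per-gene identity values that Table 1 is described as reporting.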

The analysis suggests that S.A.AR86 is most similar to the other 
South African isolate, Girdwood S.A., and that the South African isolates are more 
similar to the Swedish Ockelbo82 isolate than to the Egyptian Sindbis AR339
isolate. These results also suggest that it is unlikely that S.A.AR86 is a
recombinant virus like WEE virus. Hahn et al., Proc. Natl. Acad. Sci. USA 85,
5997-6001 (1988).

EXAMPLE 4
Neurovirulence of S.A.AR86 and Girdwood S.A.
Girdwood S.A., Ockelbo82, and S.A.AR86 are related by sequence;
in contrast, it has previously been reported that only S.A.AR86 displayed the adult
mouse neurovirulence phenotype. Russell et al., J. Virol. 63, 1619-29 (1989).
These findings were confirmed by the present investigations. Briefly, groups of
four female CD-1 mice (3-6 weeks of age) were inoculated ic with 10^3 plaque-
forming units (PFU) of S.A.AR86, Girdwood S.A., or Ockelbo82. Neither
Girdwood S.A. nor Ockelbo82 infection produced any clinical signs of infection.
Infection with S.A.AR86 produced neurological signs within four to five days and
ultimately killed 100% of the mice as previously demonstrated.

Table 2 lists those amino acids of S.A.AR86 which might explain 
the neurovirulence phenotype in adult mice. A position was scored as potentially 
related to the S.A.AR86 adult neurovirulence phenotype if the S.A.AR86 amino 
acid differed from that which otherwise was absolutely conserved at that position 
in the other viruses. 

TABLE 2
Divergent Amino Acids in S.A.AR86

Protein   Position in S.A.AR86   S.A.AR86 Amino Acid   Conserved Amino Acid
nsP1      583                    Thr                   Ile
nsP2      256                    Arg                   Ala
          648                    Ile                   Val
          651                    Lys                   Glu
nsP3      344                    Gly                   Glu
          386                    Tyr                   Ser
          441                    Asp                   Gly
          445                    Ile                   Met
          537                    Cys                   Opal
E2        243                    Ser                   Leu
6K        30                     Val                   Ile
E1        112                    Val                   Ala
          169                    Leu                   Ser
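
The scoring rule used for Table 2 can be sketched in Python as follows. This is
a hypothetical illustration, assuming pre-aligned, equal-length amino acid
sequences in one letter code. The function name and the toy sequences are
invented for the example and are not data from the viruses compared here.

```python
def divergent_positions(target, others):
    """Find positions where `target` differs from an otherwise conserved residue.

    Returns (1-based position, target residue, conserved residue) tuples for
    every column in which all sequences in `others` agree with each other
    but not with `target`.
    """
    if not others:
        raise ValueError("at least one comparison sequence is required")
    if any(len(seq) != len(target) for seq in others):
        raise ValueError("all sequences must be aligned to the same length")
    hits = []
    for index, residue in enumerate(target):
        column = {seq[index] for seq in others}
        if len(column) == 1:  # absolutely conserved among the other viruses
            conserved = next(iter(column))
            if residue != conserved:
                hits.append((index + 1, residue, conserved))
    return hits


# Toy example: position 2 varies among the others, so only position 3 is scored.
toy_hits = divergent_positions("MKVIE", ["MKTIE", "MKTIE", "MRTIE"])
print(toy_hits)
```

A position where the comparison viruses disagree among themselves is skipped,
which mirrors the requirement that the residue be absolutely conserved in the
other viruses.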

EXAMPLE 5
pS55 Molecular Clone of S.A.AR86
As a first step in investigating the unique adult mouse
neurovirulence phenotype of S.A.AR86, a full-length cDNA clone of the
S.A.AR86 genome was constructed. The sources of cDNA included conventional
cDNA clones (Davis et al., Virology 171, 189-204 (1989)) as well as uncloned
RT-PCR fragments derived from the S.A.AR86 genome. As described previously,
these were substituted, starting at the 3' end, into pTR5000 (McKnight et al., J.
Virol. 70, 1981-89 (1996)), a full-length Sindbis clone from which infectious
genomic replicas could be derived by transcription with SP6 polymerase in vitro.

The end result was pS55, a molecular clone of S.A.AR86 from 
which infectious transcripts could be produced and which contained four nucleotide 
changes (G for A at nt 215; G for C at nt 3863; G for A at nt 5984; and C for T 
at nt 9113) but no amino acid coding differences with respect to the S.A.AR86 
genomic RNA (amino acid sequence of S.A.AR86 presented in Figure 2 (SEQ ID
NO:2 and SEQ ID NO:3)). The nucleotide sequence of clone pS55 is presented 
in Figure 5 (SEQ ID NO:7). 

As has been described by Simpson et al., Virology 222, 464-69
(1996), neurovirulence and replication of the virus derived from pS55 (S55) were
compared with those of S.A.AR86. It was found that S55 exhibits the distinctive
adult neurovirulence characteristic of S.A.AR86. Like S.A.AR86, S55 produces
100% mortality in adult mice infected with the virus, and the survival times of
animals infected with either virus were indistinguishable. In addition, S55 and
S.A.AR86 were found to replicate to essentially equivalent titers in vivo, and the
profiles of S55 and S.A.AR86 virus growth in the central nervous system and
periphery were very similar.

From these data it was concluded that the silent changes found in 
virus derived from clone pS55 had little or no effect on its growth or virulence, 
and that this molecularly cloned virus accurately represents the biological isolate, 
S.A.AR86.

EXAMPLE 6
Construction of the Consensus AR339 Virus TR339
The consensus sequence of the Sindbis virus AR339 isolate, the
prototype alphavirus, was deduced. The consensus AR339 sequence was inferred
by comparison of the TRSB sequence (a laboratory-derived AR339 strain) with the
complete or partial sequences of HRsp (the GenBank sequence; Strauss et al.,
Virology 133, 92-110 (1984)), SV1A, and NSV (AR339-derived laboratory strains;
Lustig et al., J. Virol. 62, 2329-36 (1988)), and SIN (a laboratory-derived AR339
strain; Davis et al., Virology 161, 101-108 (1987), Strauss et al., J. Virol. 65,
4654-64 (1991)). Each of these viruses was descended from AR339. Where these
sequences differed from each other, they also were compared with the amino acid
sequences of other viruses related to Sindbis virus: Ockelbo82, S.A.AR86,
Girdwood S.A., and the somewhat more distantly related Aura virus. Rumenapf
et al., Virology 208, 621-33 (1995).

The details of determining a consensus AR339 sequence and
constructing the consensus virus TR339 have been described elsewhere. McKnight
et al., J. Virol. 70, 1981-89 (1996); Klimstra et al., manuscript in preparation.
The nucleotide sequence (SEQ ID NO:8) of pTR339 is presented in Figure 6.
The deduced amino acid sequences of the pTR339 non-structural and structural
polyproteins are shown as SEQ ID NO:9 and SEQ ID NO:10, respectively. The
asterisk at position 1897 in SEQ ID NO:9 indicates the position of the opal
termination codon in the coding region of the nonstructural polyprotein. The
consensus nucleotide sequence diverged from the pTRSB sequence at three coding
positions (nsP3 528, E2 1, and E1 72). These differences are illustrated in Table 3.



TABLE 3
Amino Acid Differences Between
Laboratory Strain TRSB and Molecular Clone TR339

         nsP3 528 (nt 5683)   E2 1 (nt 8633)   E1 72 (nt 10279)
TR339    Arg (CGA)            Ser (AGC)        Ala (GCU)
TRSB     Gln (CAA)            Arg (AGA)        Val (GUU)

EXAMPLE 7
Animals Used for In Vivo Localization Studies 
Specific pathogen free CD-1 mice were obtained from Charles River
Breeding Laboratories (Raleigh, North Carolina) at 21 days of age and maintained
under barrier conditions until approximately 37 days of age. Intracerebral (ic)
inoculations were performed as previously described, Simpson et al., Virol. 222,
464-69 (1996), with 500 PFU of S51 (an attenuated mutant of S55) or 10^3 PFU of
S55. Animals inoculated peripherally were first anesthetized with METOFANE®.
Then, 25 µl of diluent (PBS, pH 7.2, 1% donor calf serum, 100 u/ml penicillin,
50 µg/ml streptomycin, 0.9 mM CaCl2, and 0.5 mM MgCl2) containing 10^3 PFU
of virus were injected either intravenously (iv) into the tail vein, subcutaneously
(sc) into the skin above the shoulder blades on the middle of the back, or
intraperitoneally (ip) in the lower right abdomen. Animals were sacrificed at
various times post-inoculation as previously described. Simpson et al., Virol. 222,
464-69 (1996). Brains (including brainstems) were homogenized in diluent to 30%
w/v, and right quadriceps were homogenized in diluent to 25% w/v. Homogenates
were handled and titered as described previously. Simpson et al., Virol. 222,
464-69 (1996). Bone marrow was harvested by crushing both femurs from each animal
in sufficient diluent to produce a 30% w/v suspension (calculated as weight of
uncrushed femurs in volume of diluent). Samples were stored at -70°C. For
titration, samples were thawed and clarified by centrifugation at 1,000 x g for 20
minutes at 4°C before being titered by conventional plaque assay on BHK-21 cells.
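
The diluent volumes implied by the w/v percentages above follow from a one-line
calculation, sketched here in Python. The sketch assumes that % w/v means grams
of tissue per 100 ml of diluent, consistent with the femur suspension being
calculated as weight of tissue in volume of diluent. The function name and the
sample weights are hypothetical and are not measurements from these studies.

```python
def diluent_volume_ml(tissue_weight_g, percent_wv):
    """Diluent volume (ml) that gives the requested % w/v suspension.

    Assumes % w/v = grams of tissue per 100 ml of diluent.
    """
    if tissue_weight_g <= 0 or percent_wv <= 0:
        raise ValueError("weight and percentage must be positive")
    return tissue_weight_g * 100.0 / percent_wv


# Hypothetical tissue weights, chosen only to illustrate the arithmetic.
brain_volume = diluent_volume_ml(0.45, 30)       # brain homogenized to 30% w/v
quadriceps_volume = diluent_volume_ml(0.20, 25)  # quadriceps to 25% w/v
femur_volume = diluent_volume_ml(0.15, 30)       # both femurs to 30% w/v

print(f"brain: {brain_volume:.2f} ml")
print(f"quadriceps: {quadriceps_volume:.2f} ml")
print(f"femurs: {femur_volume:.2f} ml")
```

Because the volume scales with tissue weight, each sample is diluted to the
same tissue concentration regardless of its size.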

EXAMPLE 8 
Tissue Preparation for In Situ Hybridization Studies 
Animals were anesthetized by ip injection of 0.5 ml AVERTIN® at
various times post-inoculation followed by perfusion with 60 to 75 ml of 4%
paraformaldehyde in PBS (pH 7.2) at a flow rate of 10 ml per minute. The entire
carcass was decalcified for 8 to 10 weeks in 4% paraformaldehyde containing 8%
EDTA in PBS (pH 6.8) at 4°C. This solution was changed twice during the
decalcification period. Selected tissues were cut into blocks approximately 3 mm
thick and placed into biopsy cassettes for paraffin embedding and sectioning.
Blocks were embedded, sectioned and hematoxylin/eosin stained by Experimental
Pathology Laboratories (Research Triangle Park, North Carolina) or North
Carolina State University Veterinary School Pathology Laboratory (Raleigh, North
Carolina).

EXAMPLE 9 
In Situ Hybridization 
Hybridizations were performed using a [35S]-UTP labeled S.A.AR86
specific riboprobe derived from pDS-45. Clone pDS-45 was constructed by first
amplifying a 707 base pair fragment from pS55 by PCR using primers 7241 (5'-
CTGCGGCGGATTCATCTTGC-3', SEQ ID NO:11) and SC-3 (5'-
CTCCAACTTAAGTG-3', SEQ ID NO:12). The resulting 707 base pair fragment
was purified using a GENE CLEAN® kit (Bio101, CA), digested with HhdL, and
cloned into the SmaI site of pSP72 (Promega). Linearizing pDS-45 with EcoRV
and performing an in vitro transcription reaction with SP6 DNA-dependent RNA
polymerase (Promega) in the presence of [35S]-UTP resulted in a riboprobe
approximately 500 nucleotides in length of which 445 nucleotides were
complementary to the S.A.AR86 genome (nucleotides 7371 through 7816). A
riboprobe specific for the influenza strain PR-8 hemagglutinin (HA) gene was used
as a control probe to test non-specific binding. The in situ hybridizations were
performed as described previously (Charles et al., Virol. 208, 662-71 (1995)) using
10^5 cpm of probe per slide.

EXAMPLE 10 
Replication of S.A.AR86 in Bone Marrow 
Three groups of six adult mice each were inoculated peripherally
(sc, ip, or iv) with 1200 PFU of S55 (a molecular clone of S.A.AR86) in 25 µl
of diluent. Under these conditions, the infection produced no morbidity or
mortality. Two mice from each group were anesthetized and sacrificed at 2, 4 and
6 days post-inoculation by exsanguination. The serum, brain (including
brainstem), right quadriceps, and both femurs were harvested and titered by plaque
assay. Virus was never detected in the quadriceps samples of animals inoculated
sc (Table 4). A single animal inoculated ip (two days post-inoculation) and two
mice inoculated iv (at four and six days post-inoculation) had detectable virus in
the right quadriceps, but the titer was at or just above the limit of detection (6.25
PFU/g tissue). Virus was present sporadically or at low levels in the brain and
serum of animals regardless of the route of inoculation. Virus was detected in the
bone marrow of animals regardless of the route of inoculation. However, the
presence of virus in bone marrow of animals inoculated sc or ip was more sporadic
than in animals inoculated iv, where five out of six animals had detectable virus.
These results suggest that S55 targets to the bone marrow, especially following iv
inoculation.

The level and frequency of virus detected in the serum and muscle
suggested that virus detected in the bone marrow was not residual virus
contamination from blood or connective tissue remaining in bone marrow samples.
The following experiment also suggested that virus in bone marrow was not due
to tissue or serum contamination. Mice were inoculated ic with 1200 PFU of S55
in 25 µl of diluent. Animals were sacrificed at 0.25, 0.5, 1, 1.5, 2, 3, 4, 5, and
6 days post-inoculation, and the carcasses were decalcified as described in
Example 8. Coronal sections taken at approximately 3 mm intervals through the
head, spine (including shoulder area), and hips were probed with an S55-specific
[35S]-UTP labeled riboprobe derived from pDS-45. Positive in situ hybridization
signal was detected by one day post-inoculation in the bone marrow of the skull
(data not shown). Weak signal also was present in some of the chondrocytes of
the vertebrae, suggesting that S55 was replicating in these cells as well. Although
the frequency of positive bone marrow cells was low, the signal was very intense
over individual positive cells. This result strongly suggests that S55 replicates in
vivo in a subset of cells contained in the bone marrow.

EXAMPLE 11 
Other Sindbis Group Viruses 
It was of interest to determine if the ability to replicate in the bone
marrow of mice was unique to S55 or was a general feature of other viruses, both
Sindbis and non-Sindbis viruses, in the Sindbis group. Six 38-day-old female
CD-1 mice were inoculated iv with 25 µl of diluent containing 10^3 PFU of S55,
Ockelbo82, Girdwood S.A., TR339, or TRSB. At 2, 4 and 6 days post-
inoculation two mice from each group were sacrificed and whole blood, serum,
brain (including brainstem), right quadriceps, and both femurs were harvested for
virus titration.



WO 98/36779 PCT/US98/02945

The results of this experiment were similar to those with S55.
TRSB infected animals had no virus detectable in serum or whole blood in any
animal at any time, and with the other viruses tested, no virus was detected in the
serum or whole blood of any animal beyond two days post-inoculation (detection
limit, 25 PFU/ml). Neither TRSB nor TR339 was detectable in the brains of
infected animals at any time post-inoculation. S55, Girdwood S.A., and
Ockelbo82 were present in the brains of infected animals sporadically with the
titers being at or near the 75 PFU/g level of detection. All the tested viruses were
found sporadically at or slightly above the 50 PFU/g detection limit in the right
quadriceps of infected animals except for a single animal four days post-inoculation
with TRSB which had nearly 10^5 PFU/g of virus in its quadriceps.

The frequency at which the different viruses were detected in bone 
marrow varied widely, with S55 and Girdwood S.A. being the most frequently 
isolated (five out of six animals) and Ockelbo82 and TRSB being the least 
frequently isolated from bone marrow (one out of six animals and two out of six
animals, respectively) (Table 4). Girdwood S.A. and S55 gave nearly identical 
profiles in all tissues. Girdwood S.A., unlike S.A.AR86, is not neurovirulent in 
adult mice (Example 4), suggesting that the adult neurovirulence phenotype is 
distinct from the ability of the virus to replicate efficiently in bone marrow.

EXAMPLE 12
Virus Persistence in Bone Marrow
The next step in our investigations was to evaluate the possibility
that S.A.AR86 persisted long-term in bone marrow. S51 is a molecularly cloned,
attenuated mutant of S55. S51 differs from S55 by a threonine for isoleucine
substitution at amino acid residue 538 of nsP1 and is attenuated in adult mice
inoculated intracerebrally. Like S55, S51 targeted to and replicated in the bone
marrow of 37-day-old female CD-1 mice following ic inoculation. Mice were
inoculated ic with 500 PFU of S51 and sacrificed at 4, 8, 16, and 30 days post-
inoculation for determination of bone marrow and serum titers. At no time post-
inoculation was virus detected in the serum above the 6.25 PFU/ml detection limit.
Virus was detectable in the bone marrow samples of both animals sampled at four
days post-inoculation and in one animal eight days post-inoculation (Table 5). No
virus was detectable by titration on BHK-21 cells in any of the bone marrow
samples beyond eight days post-inoculation. These results suggested that the
attenuating mutation present in S51, which reduces the neurovirulence of the virus,
did not impair acute viral replication in the bone marrow.

It was notable that the plaque size on BHK-21 cells of virus 
recovered on day 4 post-inoculation was smaller than the size of plaques produced 
by the inoculum virus, and that plaques produced from virus recovered from the 
day 8 post-inoculation samples were even smaller and barely visible. This
suggests a strong selective pressure in the bone marrow for virus that is much less
efficient in forming plaques on BHK-21 cells.



To demonstrate that S51 virus genomes were present in bone
marrow cells long after acute infection, four to six-week-old female CD-1 mice
were inoculated ic with 500 PFU of S51. Three months post-inoculation two
animals were sacrificed, perfused with paraformaldehyde and decalcified as
described in Example 8. The heads and hind limbs from these animals were
paraffin embedded, sectioned, and probed with a S.A.AR86 specific [35S]-UTP
labeled riboprobe derived from clone pDS-45. In situ hybridization signal was
clearly present in discrete cells of the bone and bone marrow of the legs (data not
shown). Furthermore, no in situ hybridization signal was detected in an adjacent
control section probed with an influenza virus HA gene specific riboprobe. As the
relative sensitivity of in situ hybridization is reduced in decalcified tissues (Peter
Charles, personal communication), these cells likely contain a relatively high
number of viral sequences, even at three months post-inoculation. No in situ
hybridization signal was observed in mid-sagittal sections of the heads with the
S.A.AR86 specific probe, although focal lesions were observed in the brain
indicative of the prior acute infection with S51.



TABLE 5
S51 Titers in Bone Marrow Following IC Inoculation of 500 PFU

Days Post-      Titers (Total PFU/Animal)     Limit of
Inoculation     Animal A       Animal B       Detection
4               2100           380            62.5
8               62.5           N.D.*          62.5
16              N.D.           N.D.           62.5
30              N.D.           N.D.           62.5

* "N.D." indicates that the virus titers were below the limit of detection.

EXAMPLE 13

Replication of S.A.AR86 within Bone/Joint Tissue of Adult Mice

Several Old World alphaviruses, including Ross River virus,
Chikungunya virus, Ockelbo82, and S.A.AR86, are associated with acute and persistent
arthritis/arthralgia in humans. Molecular clones of several Sindbis group viruses,
including S.A.AR86, were used to investigate alphavirus replication within bone/joint
tissue.

Following intravenous inoculation of S.A.AR86 into adult CD-1 mice,
viral replication was observed in bone/joint tissue, but not surrounding muscle tissue of
the hind limbs. Infectious virus was detectable 24 hours post-infection; however, viral
titer within bone/joint tissue was maximal 72 hours post-infection. Fractionation of
hind limbs from infected animals revealed that the hip and knee joints were the
predominant sites of viral replication. Replication within bone/joint tissue appears to be
a common trait of Sindbis-group viruses, since the laboratory strains TR339 and TRSB
also replicated within bone/joint tissue. In situ hybridization and S.A.AR86 based
double promoter vectors expressing green fluorescent protein were used to further
localize S.A.AR86 infected cells within bone/joint tissue. Green fluorescent protein
expression was detected in bone/joint tissue for at least one month post-inoculation.
These studies demonstrated that cells within the endosteum of synovial joints were the
predominant site of S.A.AR86 replication.

THAT WHICH IS CLAIMED IS:

1. A method of introducing and expressing heterologous RNA
in bone marrow cells, comprising:

(a) providing a recombinant alphavirus, said alphavirus
containing a heterologous RNA segment, said heterologous RNA segment
comprising a promoter operable in said bone marrow cells operatively associated
with a heterologous RNA to be expressed in said bone marrow cells; and then

(b) contacting said recombinant alphavirus to said bone marrow
cells so that said heterologous RNA segment is introduced and expressed therein.

2. A method according to claim 1, wherein said contacting step
is carried out in vitro.

3. A method according to claim 1, wherein said contacting step
is carried out in vivo in a subject in need of such treatment.

4. A method according to claim 1, wherein said heterologous
RNA encodes a protein or peptide.

5. A method according to claim 1, wherein said heterologous
RNA encodes an immunogenic protein or peptide.

6. A method according to claim 1, wherein said heterologous
RNA encodes an antisense oligonucleotide or a ribozyme.

7. A method according to claim 1, wherein said alphavirus is
an Old World alphavirus.

8. A method according to claim 1, wherein said alphavirus is
selected from the group consisting of SF group and SIN group alphaviruses.




9. A method according to claim 1, wherein said alphavirus is
selected from the group consisting of Semliki Forest virus, Middelburg virus,
Chikungunya virus, O'Nyong-Nyong virus, Ross River virus, Barmah Forest
virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus, Una virus,
Sindbis virus, South African Arbovirus No. 86, Ockelbo virus, Girdwood S.A.
virus, Aura virus, Whataroa virus, Babanki virus, and Kyzylagach virus.

10. A method according to claim 1, wherein said alphavirus is 
South African Arbovirus No. 86. 

11. A method according to claim 1, wherein said alphavirus is 
Girdwood S.A.

12. A method according to claim 1, wherein said alphavirus is 
Sindbis strain TR339. 



13. A helper cell for expressing an infectious, propagation 
defective, Girdwood S.A. virus particle, comprising, in a Girdwood S.A.- 
15 permissive cell: 

(a) a first helper RNA encoding (i) at least one Girdwood S.A. 
structural protein, and (ii) not encoding at least one other Girdwood S.A. structural 
protein; and 

(b) a second helper RNA separate from said first helper RNA,
said second helper RNA (i) not encoding said at least one Girdwood S.A.
structural protein encoded by said first helper RNA, and (ii) encoding said at least
one other Girdwood S.A. structural protein not encoded by said first helper RNA,
and with all of said Girdwood S.A. structural proteins encoded by said first and
second helper RNAs assembling together into Girdwood S.A. particles in said cell
containing said replicon RNA;

and wherein the Girdwood S.A. packaging segment is deleted from 
at least said first helper RNA. 

14. The helper cell according to claim 13, further containing a
replicon RNA;

said replicon RNA encoding said Girdwood S.A. packaging segment 
and an inserted heterologous RNA; 

wherein said Girdwood S.A. packaging segment is deleted from at 
least one of said helper RNAs;

and wherein said replicon RNA, said first helper RNA, and said 
second helper RNA are all separate molecules from one another. 

15. The helper cell according to claim 13, further containing a 

replicon RNA; 

said replicon RNA encoding said Girdwood S.A. packaging segment 
and an inserted heterologous RNA; 

wherein said replicon RNA and said first helper RNA are separate 

molecules; 

and wherein the molecule containing said replicon RNA further 
contains RNA encoding said at least one Girdwood S.A. structural protein not 
encoded by said first helper RNA.

16. The helper cell according to claim 13, wherein said first 
helper RNA encodes both the Girdwood S.A. E1 glycoprotein and the Girdwood
S.A. E2 glycoprotein, and wherein said second helper RNA encodes the Girdwood 
S.A. capsid protein. 

17. A method of making infectious, propagation defective, 
Girdwood S.A. virus particles, comprising: 

transfecting a Girdwood S.A.-permissive cell according to claim 13
with a propagation defective replicon RNA, said replicon RNA including said 
Girdwood S.A. packaging segment and an inserted heterologous RNA; 

producing said Girdwood S.A. virus particles in said transfected
cell; and then

collecting said Girdwood S.A. virus particles from said cell. 

18. Infectious Girdwood S.A. virus particles produced by the
method of Claim 17.



19. Infectious Girdwood S.A. virus particles containing a
replicon RNA encoding a promoter, an inserted heterologous RNA, and wherein
RNA encoding at least one Girdwood S.A. structural protein is deleted therefrom
so that said Girdwood S.A. virus particle is propagation defective. 

20. A pharmaceutical formulation comprising infectious 
Girdwood S.A. virus particles according to claim 18 or 19 in a pharmaceutically
acceptable carrier. 

21. A helper cell for expressing an infectious, propagation
defective, TR339 virus particle, comprising, in a TR339-permissive cell:

(a) a first helper RNA encoding (i) at least one TR339 structural 
protein, and (ii) not encoding at least one other TR339 structural protein; and

(b) a second helper RNA separate from said first helper RNA,
said second helper RNA (i) not encoding said at least one TR339 structural protein
encoded by said first helper RNA, and (ii) encoding said at least one other TR339
structural protein not encoded by said first helper RNA, and with all of said
TR339 structural proteins encoded by said first and second helper RNAs
assembling together into TR339 particles in said cell containing said replicon
RNA;

and wherein the TR339 packaging segment is deleted from at least 
said first helper RNA. 

22. The helper cell according to claim 21, further containing a 

replicon RNA; 

said replicon RNA encoding said TR339 packaging segment and an
inserted heterologous RNA;

wherein said TR339 packaging segment is deleted from at least one 
of said helper RNAs;

and wherein said replicon RNA, said first helper RNA, and said 
second helper RNA are all separate molecules from one another.




23. The helper cell according to claim 21, further containing a

replicon RNA; 

said replicon RNA encoding said TR339 packaging segment and an 
inserted heterologous RNA; 

wherein said replicon RNA and said first helper RNA are separate

molecules; 

and wherein the molecule containing said replicon RNA further 
contains RNA encoding said at least one TR339 structural protein not encoded by 
said first helper RNA. 

24. The helper cell according to claim 21, wherein said first
helper RNA encodes both the TR339 E1 glycoprotein and the TR339 E2
glycoprotein, and wherein said second helper RNA encodes the TR339 capsid 
protein. 

25. A method of making infectious, propagation defective, 
TR339 virus particles, comprising:

transfecting a TR339-permissive cell according to claim 21 with a
propagation defective replicon RNA, said replicon RNA including said TR339 
packaging segment and an inserted heterologous RNA; 

producing said TR339 virus particles in said transfected cell; and
then

collecting said TR339 virus particles from said cell. 

26. Infectious TR339 virus particles produced by the method of
Claim 25.

27. Infectious TR339 virus particles containing a replicon RNA 
encoding a promoter, an inserted heterologous RNA, and wherein RNA encoding
at least one TR339 structural protein is deleted therefrom so that said virus particle
is propagation defective. 

28. A pharmaceutical formulation comprising infectious TR339 
virus particles according to Claim 26 or 27 in a pharmaceutically acceptable carrier.

29. A recombinant DNA comprising a cDNA coding for an
infectious Girdwood S.A. virus RNA transcript and a heterologous promoter 
positioned upstream from said cDNA and operatively associated therewith. 

30. An infectious RNA transcript encoded by a cDNA according 

to claim 29. 

31. An infectious RNA according to claim 30, said infectious 
Girdwood S.A. RNA transcript containing a heterologous RNA segment, said 
heterologous RNA segment comprising a promoter operably associated with a 
heterologous RNA. 

32. Infectious viral particles containing an RNA transcript 
according to claim 30. 

33. A recombinant DNA comprising a cDNA coding for a 
Sindbis strain TR339 RNA transcript and a heterologous promoter positioned 
upstream from said cDNA and operatively associated therewith. 

34. An infectious RNA transcript encoded by a cDNA according 

to claim 33. 

35. An infectious RNA according to claim 34, said infectious 
Girdwood S.A. RNA transcript containing a heterologous RNA segment, said 
heterologous RNA segment comprising a promoter operably associated with a 
heterologous RNA. 

36. Infectious viral particles containing an RNA transcript
according to claim 34. 






Nucleotide Sequence of S.A.AR86 



I ATTCCCCCCQ TAGTACACAC tattoaatca aacaccccac caattgcact accatcacaa tccacaaccc agtagttaac otagacgtag accctcagao 



101 11X01 MGIt CTGCAACrCC aaaaoacctt cococaattt cacctaqtao cacaccacgt cactccaaat oaccatccta atcccaoac c attttcccat 

201 CTGGCCAGTA AACTAATCGA GCTGGAGGTT CCTACCACAO CQACQATTTT CQACATACCC ACCCCACCCC CTCCTACAAT CTTTTOCGAO CACCAOTACC 
301 ATTCCGTTTO CCCCATCCOT ACTCCAGAAO ACCCGCACCa CATOATOAAA TATCCCAOCA AACTCCOGCA AAAACCATOT AAQATTACAA ACAAQAACTT 
401 CCATOAGAAG ATCAACCACC TCCGGACCGT ACTTOATACA CCCCATCCTC AAACCCCATC ACTCTCCTTC CACAACCATG TTACCTCCAA C A COCO I UX 
501 OACTACTC CG TCATOCACCA CGTOTACATC AACC C T CCCO GAACTATTTA C C A CC ACC C T ATCAAACCCO TCCCOA C CCT GTACTCCATT CCCTTCOACA 
601 CCACCCACTT CATCTTCTCC CCTATCCCAC CTTCCTACCC TCCATACAAC ACCAACTCCC CCCACGAAAA AGTCCTTOAA CCCCCTAACA TCCQACTCTC 
701 CAGCACAAAG CTCACTGAAO CCAGGACACG AAACTTCTCO ATAATCACCA ACAACCACTT OAACCCCCCG TCACCCGTTT atttctccct TCOATCCACA 
S01 CTTTACCCAO AACACAOACC CACCTTCCAC AGCTCCCATC TTCCATCCCT CTTCCACTTC AAACCAAACC AGTCCTACAC TTCCCCCTCT catacactcc 
SOI TCACCTCCCA ACCCTACCTA CTOAAGAAAA TCACCATCAG TCCCGCGATC ACCGOAGAAA COGTCCGATA CGCGGTTACA AACAATAGCG ACGGCTTCTT 
1001 GCTATCCAAA GTTACCOATA CAGTAAAAGG AGAACGGGTA ICUIUXLU TOTCCACCTA TATCCCGGCC ACCATATGCG ATCAGATGAC CCGCATAATG 
1 101 OCCACGGATA TCTCACCTGA CGATGCACAA AAACTTCTGG TTOGGCTCAA CCAGCGAATC GTCATTAACG GTAAGACTAA CAGGAACACC AATACCATCC 
1201 AAAATTACCT TCTCCCAATC ATTGCACAAG GGTTCAGCAA ATGGGCCAAO GAGCGCAAAG AAGATCTTGA CAATCAAAAA ATGCTGGGCA CCAGAGAGCG 
1301 CAAGCTTACA TATGGCTGCT TGTGCGCGTT TCGCACTAAG AAAGTGCACT CGTTCTATCG CCCACCTGGA ACGCAGACCA TCGTAAAAGT CCCAGCCTCT 
1401 TTTAGCGCTT TCCCCATGTC ATCCCTATCO ACTACCTCTT TCCCCATGTC GCTCAGGCAG AAGATGAAAT TCGCATTACA ACCAAAGAAG GAGGAAAAAC 
1301 TCCTCCAAGT CCCGGAGGAA TTAGTTATGG ACGCCAAGGC TCCTTTCGAG GATGCTCAGG AGGAATCCAG AGCCGAGAAO CTCCGACAAG CACTCCCACC 

1601 ATTAGTGGCA GACAAAGGTA TCGACGCAGC TGCGGAAGTT GTCTGCGAAG tggaggggct ccaggcggac accggagcag cactcgtcga aacc ccocg c 
1701 ggtcatgtaa ggataatacc tcaagcaaat gaccgtatga tcggacagta tatcgttgtc tcgcccatct ctgtgctgaa gaacgctaaa CTCCCACCAO 
I soi cacacccgct aggagaccag gttaagatca taacgcactc cggaagatca goaaggtatg cagtcgaacc atacgacgct aaagtactga tgccagcago 
1901 aagtgccgta ccatggccag aattcttagc actgagtgag agcgccacgc ttgtotacaa cgaaagagag tttctgaacc ccaacctcta ccatattgcc 
2X1 atccacggtc ccqctaaoaa tacaoaagao gagcagtaca aggttacaaa ggcagagctc GCAGAAACAG ACTACCTCTT tgacgtgoac aaoaagogat 
1101 gcgttaagaa ggaagaagcc tcaggactto tcctttccoo agaactgacc aaccccccct atcaggaact agctcttgag ggactgaaga ctcgacccgc 
2201 ggtcccgtac aaggtfgaaa caataggagt gataggcaca ccaggatcgg gcaagtcacc tatcatcaag tcaactctca cggcacgtga tcttottacc 
2301 agcggaaaga aagaaaactc ccgcgaaatt gagggcgaco tgctacggct gaggggcatg cagatcacgt cgaagacagt ggattcggtt atgctcaacg 
2401 gatgccacaa agccgtagaa gtgctgtatg ttgacgaagc gttccggtgc cacgcaggag cactacttgc cttgattgca atcgtcagac cccgtaagaa 
2501 ggtagtacta tgcgoagacc ctaaccaatg cggattcttc aacatgatgc aactaaaggt acatttcaac caccctgaaa aagacatatg taccaaoaca 
:ai ttctacaagt ttatctcccg accttgcaca cagccagtca cggctattgt atcgacactg cattacgatg gaaaaatgaa AACCACAAAC CCGTGCAAGA 

2701 AGAACATCGA AATCCACATT ACAGGGGCCA CGAAGCCGAA GCCAGGCGAC ATCATCCTGA CATGTTTCCG CGGGTGGGTT AAGCAACTGC AAATCCACTA 
ISOt T CCCG GACAT GAGGTAATGA CAGCCGCGGC CTCACAAGGO CTAACCAGAA AAGGAGTATA TGCCGTCCGG CAAAAAGTCA ATGAAAACCC CCTGTACGCC 
2901 atcacatcao AGCATGTGAA CGTGTTGCTC ACCCGCACTG AGGACAGCCT AGTATCGAAA ACTTTACAGC GCGACCCATG GATTAAGCAG CTCACTAACO 
3001 TACCTAAAGG AAA 1 1 1 1 LA C GCCACCATCG AGGACTGGGA AGCTGAACAC AAGGGAATAA TTGCTCCGAT AAACAGTCCC CCTCCCCGTA CCAATCCOTT 
3101 CAGCTGCAAO ACTAACGTTT GCTCGGCOAA AGCACTGGAA CCGATACTGG CCACGGCCGG TATCGTACTT ACCCGTTCCC AGTGGAGCGA GCTGTTCCCA 
3201 CAGTTTCCGG ATGACAAACC ACACTCCGCC ATCTACCCCT TAGACGTAAT TTGCATTAAG TTTTTCCGCA TCGACTTGAC AAGCGGGCTG TTTTCCAAAC 
3301 agagcatccc GTTAACCTAC CATCCTCCCC ACTCAGCGAO CCCAGTAGCT CATTGGGACA acagcccacg aacacgcaag TATCGGTACC ATCACCCCGT 
3401 TCCCGCCCAA CTCTCCCGTA GATTTCCCCT GTTCCACCTA GCTGGGAAAG GCACACAGCT TGATTTGCAG ACGGGCAGAA CTAGAGTTAT CTCTGCACAG 
3501 CATAACTTGG TCCCAGTGAA CCGCAATCTC CCTCACCCCT TAGT CC C CGA GCACAAGGAG AAACAACCCG GCCCGGTCGA AAAATTCTTG aCCCAOTTCA 
3601 AACACCACTC CGTACTTCTG atctcagaga aaaaaattga agctccccac AAGAGAATCG AATGGATCGC cccgattggc ATAGCCGGCG CAGATAAGAA 

J701 CTACAACCTG GCTTTCGGGT TTCCGCCGCA GGCACGGTAC GACCTCCTCT TCATCAATAT TGGAACTAAA TACAGAAACC ATCACTTTCA ACAGTGCGAA 
3801 gaccacgcgg cqaccttoaa aaccctttcc cgttcccccc tgaactgcct taacccccga cccaccctco tcotqaactc ctacggttac cocqacccca

3901 ATACTOACCA eCTACTCACC CCTCTTCCCA GAAAATTTCT CAOAOTCTCT CCAGCOAOGC CAQAOTGCOT CTCAACCAAT ACAOAAATOT ACCTQATTTT 
4001 CCCACAACTA OACAACACCC CCACACCACA ATTCA CCCCC CATCATTTOA ATTCTOTCAT T T CO T CC CI O TACOACCCTA CAAOAGACGO AGTTGQAGCC 
4101 GCACCCTCCT ACCGTACTAA AACCGAOAAC AT TCCTGATT CTCA AC ACOA AGCAGTTGTC AATCCACCCA ATOCACTCCO CAGACCAGGA GAACCAOfCT 
4101 CCCCTCCCAT CTATAAACCT TGCCCCAACA CTTTCACCCA TTCACCCACA GAGACAGGTA CCCCAAAACT CACT C TCT CC CAAGCAAAGA AAOTGATCCA 
4301 CGCGGTTCGC CCTGATT7CC GGAAACACCC ACAGGCAGAA CCCCTGAAAT TGCTGCAAAA CCCCTACCAT CCAGTCGCAG ACTTAGTAAA tgaacataat 

4401 atcaagtctc tcc cc atccc actcctatct acaggcattt a cc cagc cg o AAAAOACCCC cttcaggtat cacttaacto cttgacaacc gcgctagaca 
4soi gaactgatgc ggacgtaacc atctactgcc tcgataagaa gtggaaggaa agaatcgacg lcg t gc t cc a acttaaggag tctgtaacto agctgaacoa 

4601 TGAGGATATO OAOATCGACO ACGACTTAGT ATGGATCCAT CCGGACAGTT GCCTGAAGGG AAGAAAGGGA TTCAGTACTA CAAAAGOAAA GTTGTATTCG 
4701 TACTTTCAAO GCACCAAATT CCATCAAGCA GCAAAACATA TGCCGGAGAT AAAGGT CCTC TTCCCAAATG ACCAGGAAAG CAACGAACAA CT OIGlULCr 
4801 ACATATTGGG GGAGACCATG GAAGCAATCC GCGAAAAATG CCCCGTCGAC CACAACCCGT C GTCT A GCCC GCCAAAAACO CTGCCGTGCC TCTCTATGTA 
4901 TCCCATGACG CCAGAAAGGG TCCACAOACT CAGAAGCAAT AACGTCAAAO AAGTTACAGT ATGCT CC TC C ACCCCCCTTC CAAAGTACAA AATCAAOAAT 
$001 GTTCACAAGO TTCAOTGCAC AAAAGTAGTC CTCTTTAACC CGCATACOCC CCCATTCCTT CCCG CC COT A AGTACATAQA AGCACCAOAA CAGOCnSCAG 
S101 CTCCGCCTGC ACAGGCCGAG GAGGCCCCCO GAGTTGTAGC OACACCAACA CCACCTGCAG CTGATAACAC CTCGCTTGAT CTCACGGACA TCTCACTGGA 
5101 CATGCAAGAC AGTAGCGAAG GCTCACTCTT TTCGAGCTTr AGOGGATCGG ACAACTACCO AAGGCAGGTG GTGGTGGCTG ACGTCCATCC COTCCAAOAO 
3301 C CT C CC CC TC TTCCACCGCC AAGGCTAAAG AAGATGGCCC GCCTGGCAGC GGCAAGAATG CAGGAAGAGC CAACTCCACC GGCAAGCACC AGCTCTCCGO 
S401 ACGAGTCCCT TCACCTTTCT tttcatgggg tatctatatc cttcggatcc CTTTTCGACG GAGAGATGGC ccgcttggca gcggcacaac ccccggcaag 

5501 TACATGCCCT ACGGATGTGC CTATGTCTTT CGGATCGTTT TCCOACGGAG ACATTGAGGA GTTGAGCCGC AGAGTAACCG AGTCGGAGCC LU1LL10UI 
3601 GGGTCATTTG AACCGGGCGA AGTGAACTCA ATTATATCGT CCCGATCAGC COTATCTTTT CCACCACGCA AGCAGAGACG TAGACGCAGG AGCAGOAOGA 
5701 CCGAATACTG TCTAACCGGG GTAGCTGGOT ACATATTTTC OACGGACACA GG CC CT GG GC ACTTGCAAAA GAAGTCCGTT CTGCAGAACC AGCTTACAGA 
5*01 ACCGACCTTC gagcgcaatg TTCTGGAAAG AATCTACGCC CCCGTGCTCG ACACOTCGAA AGAGOAACAO ctcaaactca ggtaccaoat g a t gc c ca cc 
5901 GAAGCCAACA AAAGGAGGTA CCAGTCTCGA AAAGTAGAAA ACCAOAAAGC CATAACCACT GAGCGACTGC TTTCAGGCCT ACGACTOTAT AACTCTGCCA 
6001 CAGATCAGCC AGAATGCTAT aagatcacct ACCCGAAACC ATCOTATTCC AGCAGTCTAC CaGCGAACTA ctctgaccca AAGTTTGCTC tacctcttto 
6101 TAACAACTAT CTGCATGAGA ATTACCCGAC GOTAGCATCT TATCAGATCA CCOACGAGTA CGATGCTTAC TTGGATATGG TAGACGGGAC AOTCCCTTGC 
6201 CTACATACTC CAA C I 1 1110 CCCCCCCAAG CTTAGAAGTT ACCCGAAAAG ACACGAGTAT AGAGCCCCAA ACATCCCCAG TGCGGTTCCA TCAGCOATGC 
6301 AGAACACGTT GCAAAACGTG CTC A TT OC C O CGACTAAAAO AAACTGCAAC GTCACACAAA TGCGTGAACT GCCAACACTG GACTCAGCGA CATTCAACGT 

6401 TGAATGCTTT CGAAAATATG CATGCAATGA CGAGTATTGO GAGGAGTTTG cccgaaagcc aattaggatc actactgagt TCGTTACCGC atacctcccc 
6501 AGACTGAAAG GCOCTAAGGC CGCCCCACTG TTCGCAAAGA CGCATAATTT GGTCCCATTG CAAGAAGTGC ctatcgatag attcgtcato gacatgaaaa 

6601 GAGACGTGAA agttacacct ggcacgaaac acacagaaoa aagaccgaaa ctacaactca tacaagccgc agaacccctg gcgaccgctt acctatgcgg 
6701 gatccaccgo gagttagtgc gcaggcttac acccgttttg ctacccaaca ttcacaccct ctttgacatg tcggcggagg actttgatgc aatcatagca 
6101 gaacacttca agcaaggtca cccggtactg gagacggata tcgcctcgtt cgacaaaagc caagacgacg ctatggcgtt aaccggcctg atgatcttgg 

'6901 AAGACCTGGG TCTGCACCAA CCACTACTCG ACTTCATCGA GTGCCCCTTT GGAGAAATAT CATCCACCCA TCTGCCCACG GGTACCCGTT TCAAATTCGO 
7001 GCCGATGATC AAATCCCGAA TGTTCCTCAC CCTCTTTGTC AACACAGTTC TGAATGTCCT TATCGCCAGC AGAGTATTGG AGGACCGGCT TAAAAOGTCC 
7101 AAATGTCCAO catttatccg CCACGACAAC ATTATACACG GAGTAGTATC TGACAAAGAA ATGGCTGAGA ggtgtgccac CTGGCTCAAC ATGGAGGTTA 
720] AGATCATTGA CCCAGTCATC GGCGAGAGAC caccttactt ctgcggtgga TTCATCTTGC AAGATTCGGT tacctccaca GCGTGTCGCG TGGCGGACCC 
7301 CTTGAAAAGG ctctttaagt tgggtaaacc cctcccagcc GACGATGAGC aagacgaaga CAGAAGACGC gctctcctag atgaaacaaa ggcctgottt 
7401 ACAGTAGGTA taacagacac cttagcaoto gccgtggcaa ctcggtatoa gotagacaac atcacaccto tcctgctcgc attoagaact tttccccaga 
7J0I GCAAAAGAGC atttcaaccc atcagagcgg aaataaagca tctctacggt GCTCCTAAAT agtcagcata gtacatttca tctgactaat accacaacac 

7601 CACCACCATO AATACAGOAT TCTTTAACAT GCTCGGCCGC CG UXt 1 ILL CAGCCCCCAC TCCCATOTCO AGGCCGCGGA GAAGOAGGCA GCCGGCCCCG 
7701 ATGCCTGCCC GCAATGGCCT GGCTTCCCAA ATCCAGCAAC TGACCACAGC CGTCAGTCCC CTAGTCATTO GACAGGCAAC TAOACCTCAA ACCCCACGCC 
7»1 CACGCCCGCC GCCCCCCCAG AAGAAGCAGG CGCCAAAGCA ACCACCGAAG CCGAAGAAAC CAAAAACACA GOAGAAGAAG AAGAAGCAAC ctccaaaacc 
7901 CAAACCCCCA AACAGACACC GTATGGCACT TAAGTTGGAG CCCCACACAC TCTTCCACGT CAAAAATGAO GACGGAGATG TCaTCCCOCA CGCaCTOGCC
1001 ATCCAACCAA AGGTAATGAA ACCACTOCAC CTOAAACGAA CTATTCACCA CCCTCTOCTA TCAAACCTCA AATTCACCAA OILGICA CCA TACGACATCO 
1101 AGTTCCCACA GTTGCCGGTC AACATGAGAA CTOACOCCTT CACCTACACC ACTCAACACC CTCAACCCTT CTACAACTCO CAOCAOCGAO CGGTGCAOTA 
1201 TACTCCACCC ACATTTACCA TCCCCCCCCO AOTACQACCC AGACQAOACA GT GGlLUU. CATTATCCAT AACTCACCCC CCGTTCTCOC GATAOTCCTC 



UOl CTCCTCCACC ACTGCTCACG CCCATCTCCT TCCTTGGAAA CCTGaGCTTC CCATCCAATC CCOCCCCCAC ATCCTACACC CGOQAACCAT CCACAGCrCT 
tSOI CCACATCCTC GAACAOAACC TGAACCACCA GGCCTACQAC ACCCTCCTCA ACCOCATATT C C CC T C CCCA TCGTCCGGCA GAACTAAAAG AAGCOTCACT 
UOl OACOACTTTA CCTTCACCAG CCCGTACTTG CGCACATGCT CGTACTGTCA CCATACTOAA CCCTCCTTTA CCCOCATTAA CATCCACCAO CTCTCCOATO ' 

mi AACCCOACOA caacaccata cccatacaga cttcccccca ctttccatac caccaaaccc caccaccaao ctcaaataag tacccctaca tgtcgctcga 

UOl CCACCATCAT ACTCTCAAAG AACCCACCAT COATCACATC AAOATCACCA CCTCACOACC CTCTACAACO CTTAGCTACA AAOCATACTT ILlLLlLUOj 
1901 AAGTCTCCTC CAGGGGACAG CGTAACGGTT AGCATAGCGA GTAGCAACTC AGCAACGTCA TGCACAATGG CCOCCAAGAT AAAACCAAAA TTCGT GGG AC 
9001 GGGAAAAATA TGACCTACCT CCCGTTCACG GTAAGAAGAT TCCTTCCACA GTGTACGACC GTCTGAAAGA AACAACCGCC GGCTACATCA CTATOCACAG 
9101 GCCCGGACCO CATCCCTATA CATCCTATCT GGAGGAATCA TCAGGQAAAO TTTACGCGAA GCCACCATCC GGGAAGAACA TTACGTACGA GTCCAAOTGC 
9201 GGCGATTACA AGACCGGAAC cgttacgacc CGTACCOAAA TCACCGCCTO CACCGCCATC AAGCAGTGCG t cgc ci a taa GAGOGACCAA acoaaotgcg 
9301 TCTTCAACTC GCCCOACTCC ATCAOACACO CCGACCACAC GGCCCAAGGG AAATTGCATT TOLL 1 1 I L AA GCTGATCCCO AGTACCTGCA TO G T CCCIUT 
9*01 TGCCCACGC0 CCGAACGTAG TACACGCCTT TAAACACATC AGCCTCCAAT TAGACACAGA CCATCTCACA TTCCTCACCA CCAGOAGACT AGGGGGAAAC 
9501 CCGGAACCAA CCACTOAATG GATCATCGGA AACACGGTTA GAAACTTCAC CGTCGACCGA OATCGCCTGG AATACATATG GGGCAATGAC GAACCAOTAA 
9601 CCGTCTATCC CCAAGACTCT GCACCAGGAG ACCCTCACGG ATGGCCACAC GAAATAGTAC ACCATTACTA TCATCGCCAT CCTGTGTaCA CCATCTTAGC 
9701 CGTCGCATCA GCTGCTG1GG CCATGATOAT TCGCGTAACT G1TGCAGCAT TATQTGCCTG TAAAGCCCGC CGTGACTGCC TGACGCCATA TOCCCPGCCC 
9»1 CCAAATGCCG TGATTCCAAC TTCCCTCGCA CTTTTGTGCT GTGTTAGGTC CGCTAATGCT GAAACATTCA CCGAGACCAT GAGTTACTTA T GG TC GA ACA 
'9901 CCCAGCCOTT CTTCTGGGTC CAGCTGTGTA TACCPCTGCC GGCTCTCGTC UllLlAATGC UL10MOL1C ATGCT GCCT G CCTTTTTTAO TGGTTGOOGG 
10001 CGCCTACCTG CCGAACGTAG ACGCCTACGA ACATGCCAOC ACTGTTCCAA ATGTGOCACA GATACCGTAT AAGGCACTTC TTGAAAGGGC AGGGTAOGCC 
10101 CCGCTCAATT TGGAGATTAC TGTCATCTCC TCGGAGGTTT TCCCT1ULAC CAACCAAGAG tacattacct gcaaattcac cactgtcctc CC CT C COCT A 
10201 AAGTCAGATG CTGCGGCTCC TTCGAATGTC A GCCCGCmC TCACGCACAC TATACCTGCA AGGTCTTTGG AGGGGTGTAC CCCTTCATGT GGGGAGOACC 
10301 ACAATCTTTT TCCGACAGTG agaacagcca GATGAGTGAO ccgtacctcg aattctcagt agattgcgco ACTOACCACG CGCACGCGAT TAAGGTGCAT 
10*01 ACTGCCGCOA TGAAAGTAGG ACTGCOTATA GTGTACGGGA ACACTACCAG 1 1 ILL! A G AT GTGTACGTGA ACGOAGTCAC ACCAGGAACG TCTAAAGACC 
I0S01 TGAAAGTCAT AGCTGGACCA ATTTCAGCAT TGTTTACACC ATTCOATCAC AAGGTCGTTA TCAATCCCGG CCTGGTGTAC AACTATGACT TTCCGGAATA 
10601 CGGAGCGATG AAACCAGGAG CGTTTGOAGA CATTCAAGCT ACL ILL 1 1UA CTAGCAAAGA CCTCATCGCC AGCACAGACA TTAGGCTACT CAACCCTTCC 
10701 CCCAAGAACG TGCATCTCCC GTACACGCAO CCCGCATCTG GATTCCAGAT GTGGAAAAAC AACTCACCCC GCCCACTCCA CGAAACCCCC CCTTTTGCGT 
10S0I GCAAGATTGC AGTCAATCCG CTTCGAGCGG TGGACTGCTC ATACGGGAAC ATTCCCATTT CTATTGACAT CCCGAACCCT GCCTTTATCA GGACATCAGA 
10901 TGCACCACTG GTCTCAACAG TCAAATGTGA TGTCAGTGAO TCCACTTATT CACCGGACTT CGGAGCGATG GCTACCCTGC AGTATGTATC CGACCGCGAA 
1 1001 GGACAATGCC CTGTACATTC GCATTCGAGC ACAGCAACCC TCCAAOAGTC GACAGTTCAT G T C CT G GAGA AAGGAGCCGT GACAGTACAC TTCAGCACCO 
1 1101 CGAGCCCACA GGCOAACTTC ATTGTATCGC TGTGTGGTAA gaagacaaca TGCAATGCAG aatgcaaacc ACCACCTGAT catatcgtga GCACCCCGCA 
1 1201 CAAAAATGAC CAAGAATTCC aagccgccat ctcaaaaact tcatgoagtt ggctgtttgc ccttttccgc gg cgc ctcgt ccctattaat tataggactt 
1U01 atcatttttg cttgcagcat gatgctgact agcacacgaa gatgaccgct accccccaat gacccgacca gcaaaactcg atctacttcc gaggaactca 
1 1*1 tctgcataat gcatcagcct ggtatattao atcccccctt accgcgcgca atatagcaac accaaaactc gacctatttc cgaggaagcg cagtgcataa 
1 1901 tcctccccag tcttgccaaa taatcactat attaaccatt tattcagcgg acgccaaaac tcaatgtatt tctgaggaag catggtgcat aatgccatcc 
I i«m agcgtctgca taacttttta ttatttcttt tattaatcaa caaaattttg tttttaacat ttc 



1301 



GGAGGGGCTG ATGAGGGAAC AACAACCGCC CrTTCOOTOG tcacctggaa tagcaaaggg aagacaatca agacaacccc ggaagggaca gaagactggt 

Amino Acid Sequence of the Nonstructural Polyprotein



I METFWNVDV DPQSrFWQL QXSFTQFEW AQQVTWDHA NARAFSHLAS K14ELEVFTT ATHJUCSAP ARRMFSEHQY KCVCPWUPH DFORMMKYAS 

101 KLAEKAOCIT NKNLHEXOCD UtTVUTTTOA ETF3LCFHMD VTCNTRAEYS VMQDVYINAF GTTYHQAMKO VRTLYWR7FO TTQFWPSAMA CJTPATKTKW 

Ml ADEXVLfiARN ICLOTXUE GRTGKUIMR EKEUCFGSRV YPlVCSTLYf EHRASLQSWH LFSVFHUCGK QSYTCRCOTV VJCEGYWTK IllWllUt 

301 TVOYAVTNNS EGFLLOCYTD TVKGERVSF? VCTYTFATIC DQMTRMATD BPOOAQXLL VQLNQUVIN GXTMtNTNTM QNYLLTOAQ ORXWAKOtX 

401 EDLDNEKMLO TRERKLTTCC LWAPXTXXVH &MEFFUU1 tVKVPASPSA PFMUVWTTS LFMB BQTMX LALQFKKEEX LLQVKELVM EAJCAAFEDAQ 

501 rc CT Airyf w aUTLVADJCG IEAAAEWCT VEGLQADTGA ALVSTTKOKV ROTQANDW* tGQYTWSII SVUCNAXLAT AJfflADQVKI mOOVQXY 

601 AVCTTDAKVL MFACttAVFWr EFLALSESAT LVYNEREFVN RCLYWAMKG PAKNTEEEQY KVTKAELAST EYVFOVDXXR CVXKEBASQL VUC8UMTP 

701 YHEUOEGUC TRPAVTVKVE T IG V tff TP GI GXSAIDC3TV TAROLVTSGX KENOEEAO VLRLRGMQTT 9CTVDSVMLN CCMCAVEVLY VDEAF1CHAG 

101 ALLAUAIVt PRKXWLCCD FKQCGFFKMM QUCVHFNHPE K0OXTTY1C FBRRCTQFV TAIYSTLHYD GKV ULI 1NF L L KMEIOfTGA I RJKJQUUL 

001 TCniOWVKQL QIDYPCHEVM TAAASQCLTK KCVYAVRQKV NENFLYAIR EHVNVLLTRT EDRLVWKTLQ GOFWDCQLTN VTKGKFQAT1 EOWEAEKCC1 

1001 (AAMSPAFR TNPPSOCTNV CWAXALEP1L ATACIVLTCC QWSELFFQFA DDXPHSAIYA LOVOCFFO UOLTSCLFSC QHFLTYWA OlARFVAHWD 

1101 NSFOTIKYGY DHAVAAELSR RFPVFQLACX CTQLOLQrrCR TRVBAQHNL VFVHRNLPKA LVPEHXBCQP GFVEKFL3QF KHHSVLV8B XXIEAPKKU 

1201 EWIAFIG1AO ADKNYNLAfC FFPQARYDLV FIMGTXYRN KKFQQCEDHA ATUCTLSUA LNCUffCCTL WKJYGYADR NSEDWTALA UCFVftVSAAK 

1301 PECVSSKTEM YUFKQLOKS RTRQFTTHHL NCVBJVYEO TRDOVGAATS YVTKUMAD CQESAWNAA NPLORFGBGV CftAfYKXWPN SFTDSAT8TC 

1401 TAXLTVCQGK KVTHAVCTOF RKHFEAEALX LLQNAYHAVA DLVNEHNIXJ VAIPLUTQ YAAGXORliV SLNCLTTALO RTCASVTTYC LDXKWXEftffi 

1501 AVLOUCESVT ELKBEDMBD DELVWTHPOS dJCCRXCFST TXGKLYSYFE GTKFHQAAJtD MAEOCVLFFN DQESNEQLCA YILG2TMEAI REXCFVDKNF 

IfiOl SSSnKTIK LCMYAMTfER VHRUUNNVK EVTVCSSTfL FKYKDCHVQK VQCnCWLW FHTPAFVFAR KY1EAFEQFA AJTAQAEEAF OWATTTTfA 

1701 AOKTSLDVTD BLDUEOSSB GSLFSSFSGS ONY1WJVWA DVHAVQEFAF VPFfKUCKMA R1AAARMQEE FTFFAffTSSA OOLHUFOO VSBKBLR) 

1101 GEMAR1AAAQ FPASTCmv PMSFGSFSDO ESELSRRVT ESEPVLFGSF EfCHVNSID nUAVSJTTR KQRRRRRSRR TSYCLTCVCa UFAHUUFU 

1901 HLQKK5VLQN QLTEFOERN VLERIYAFVL PT3TEEQUCL RYQMMFTEAN KSRYOSRXVB NQKAJTTERL UGLRLYNSA TDQPICYXTT YTKF5YS5SV 

2001 PANYSDPKFA VAVCNNYLKE NYTTVAIYOI TDEYSAYLOM VDCTTVACLDT ATFCFAKUtS YTKKHEYXAF N1RSAVPSAM QNTLQNVUA ATXXNCHVTQ 

2101 MROJTLDSA TTHVtCmX ACNDEYWEEF AWCTOHTTB FVTAYVARLX CPKAAALFAX THNLVPLQEV FMDRFVMOMK RDVKV1TCTK ITTESXnCVQV 

2201 1QAADLATA YLCOKXav RRLTAVLLPN tHTL/DMSAS DFDAQaEHF KQCDFVLCTD 1ASF0XSQDD AMALTCLMlt EOLCVOQPLL 0LSCAFC£t 

2301 iSTHLFTOTR FKFOAMMXSO MFLTLFVKTV LNWIASRVL EERLKTSKCA AFIGDONltH GWSDKEMAfi RCATWLNMBV UDAVlGEft FFYFCOOFTL 

2401 QOSVnTACR VADfUCJtLfTC LGXFLFADDE QOEDUUtAU. DETXAWFRVO fTDTLAVAVA TRYEVDWTF VLLALRTFAQ SKRAFQA1SC EDCKLYCCPK 



Amino Acid Sequence of the Structural Polyprotein 



I MMLGFFNMLQ RRFFFAFTAM WRPRRRRQAA PMFARNOLAS QtQQLTTAVS ALVTOQATRF QTFRPRFFPR QKXQAFKQPP KFKKFCTQEK KXXQPAKFKF 

101 CXRQRMA1XL EADRLFDVKN EOCDVIOHAt AMEGKVMKFL HVKCTIDHFV LSKLKFTKSS AYOMEFAQLF VNMRSEAFTT TJEKFECFYN WKK0AVQYSG 

201 CRPTIFRCVG CRCDSCRFIM DNSCRWAfV LCGADECTRT ALSWTWNSX CXTDCTTTtO TEEWSAAFLV TAMCLLGNV5 FFCHRFFTCY TREFSRALOI 

301 LEENVNKEAY DTLLNAILRC CSSCRSKRSV TDDFTtTSPY LCTGSYCKKr EPCFSFOCIE CJVWOEAODNT tRIQTlAQFO YPQSOAASI W KYRYMSUQD 

40! HTVKEGTUDO DCOTJCfCR RLSYKCYFLL AKCFFGDSVT VS1ASSNSAT SCTMARKOCP KFVCREKYDL FFVHCKKIFC TVYDRLKSTT ACYTTMHRFO 

301 FHAYTSYUB SJCKVYAXFF SCXNTTYEOC CCDYKTCTTVT TRTKTCCTA DCQCVAYKSO OnCWVFWFO SIRHAOHTAQ CKLHUTKU FJT04VPVAH 

©1 AFNWHCFKH BLQUXTDHL TU.TTRRUCA NPEFTTEW1I CKTVRNFTVD RDCLBYWCN HEFVRVTAQB SAFCDFHOW? HUVQHYYHR HFVYTTLAVA 

701 SAAVAMMtGV TVAALCACXA RRECLTPYAL AFHAVOTSL ALLCCVRSAN AETFTHTMJY LWWSQfFFW VQLOFLAAV WLMRCCSCC LWLWAOAY 

Ul LAXVDAYEKA TTVFHVFQtF YXALVERAOY AFLNLETTVM SSEVLPSTMQ EYITOCFTTV VPSFKVRCCG SLECQf AAHA DYTOCVFCCV YFFMWGQAQC 

901 FCOSENSQMS EAYVEtSVOC ATDHAQAOCV KTAAMXVCLR rOfCKTTSFL DVYVHCVTFO TSKDLXYIAG FOALFTTFO KXWWRCLV YNYDFFEYCA 

1001 MRFOAFCStQ ATSLTSKOU ASTDIRLLXF SAXHVHVFTT QAASGFEMWK NNSCRFUJET APFOCXIAVH PtRAVDCSYO NIFtSDIFN AAFTRTSDAF 

1101 LVTTVKCOVS ECTYSADFCO MATUJYVSOR ECQCFVKSKS STATLQE5TV HVLEKCAVTV HF3TASFQAN FTVSLCCRXT TCNAECKFFA OHTVSTTHKN 

13)1 DQEFQAAISK TSWSWLFALF CCASSLLUC LMIFACSMML TSTRR 

Nucleotide Sequence of Girdwood S.A.



1 KTTOKCGGCG TAGTATACAC TATTCAATCA AACAGCOCAC CAATTCCACT ACCATCACAA TCCAGAACCC AGTAOTTAAC OTAGACOTAO ACCCOCAQAO 
101 TCCOTTTGTC GTGCAACTGC AAAACACCTT COCCCAATTT CACCTACTAO CACAGCAGGT CACTCCAAAT CAOC A T GC T A ATOOCAOACC ATTTTCGCAT 
201 CTCCOCAOTA AACTAATCOA CCTCOACOTT CCTACCACAG OCACCATTTT COACATACCC ACCCCACOCO CTCGTAGAAT OTTTTCCQAO CACCAGTACC 



401 CCATOAGAAO ATCAACOACC TC CCO ACCCr ACTTQATACA CCCOATOCTO A AA CG CCATC ACTCTCCTTC CACAACOATO TTAOCTCCAA CACGCOTGOC 
501 OAGTACTCCG TCATGCAGGA CCTCTACATC AACGCTCCCG GAACTATTTA CCATCAGGCT ATOAAAOOCO TGCGGACCCT CTACTCCATT CCCTTOOATA 
601 CCACCCACTT CAlOl 1 UUI GCTATGGCAG CTTCCTACCC TCCCTACAAC ACCAACTOCO CCCACOAAAA AOTCCTCOAA CCCajTAACA TCOOACTCTO 
701 CACCACAAAO CTCACTCAAC GCAGGACAGG AAAGTTGTCO ATAATCACGA ACAACCACTT CAA CCCOCCO TCACCCOTTT Al 1 lULLCI TOOATCQACA 
tOI CTTTACCCAC AACACAOAOC CACCTTCCAO ACCTCCCATC TTCC ATCCCT OTTCCACCTG AAAOOAAAGC AGTCGTACAC HUCLU-lbr OATACAOTOO 
901 TCACCTCCGA ACCCTACCTA CTCAACAAAA TCAOCATCAC TCCCGGOATC A CC COACAAA CCGTGGCATA CCCCCTTACA AACAATACCO ACCCCTTCTT 
1001 CCTATCCAAA OTTACCCATA CAOTAAAACG AGAACCCCTA TCCTTCCCCO TGTGCACGTA TATCCCCCCC ACCATATQCO ATCAGATGAC OGCCATAATO 
1 101 CCCACCCATA TCTCACCTOA CGATGCACAA AAACTTCTCG TTGCCCTCAA CCACCCAATC CTCATTAACC CTAAOACTAA CAOGAACACC AATAOCATGC 
1301 AAAATTACCr TCTCCCAATC ATTCCACAAC COTTCACCAA ATCCCCCAAO QACCCCAAAO AAGACCTTQA CAATQAAAAA ATOCTCCCTA CCAOAOACCO 
1301 CAACCTTACA TATCCCTCCT TCTOGCCGTT TCCCACTAAC AAAGTCCACT CCTTCTATCO CCCACCTGGA ACGCAOAOCA TCCTAAAACT CCCAG CC1L1 
1«01 TTTACCCCTT TCCCCATCTC ATCCCTATCC ACTAOCTCTT TGCCCATGTC CCTCACCCAO AACATAAAAT TGGCATTACA ACCAAACAAQ CACCAAAAAC 
1301 TGCTGCAAGT CCCCCAGCAA TTAOTCATCO ACO CC AACCC TCCTTTCCAC CATCCTCACG ACCAATCCAG AGCGGAGAAO CTCCGAOAAO CACTCCCACC 
1601 ATTAGTGGCA OACAAAGGTA TCGAGGCAGC C G CG OAA GT T GTCTGCGAAG TGOAGGGGCT CCAGGCGGAC ATCGGAGCAO CACTCGTCOA AACCOOGOGC 
1701 GQTCATQTAA GGATAATACC ACAAGCAAAT GACCGTATOA TCGOACAGTA CATCGTTGTC TCGCCAACCT CTCTGCTGAA GAACGCTAAA CTTGCACCAO 
1101 CACACCCGCT AGCAGACCAG GTTAAGATCA TAACGCACTC CGGAAGAtCA GGAAGOTATG CAGTCGAACC ATACGACGCT AAAGTACTGA TGCCAGCACO 
1901 AAGTGGCGTA CCATGGCCAG AATTCTTAGC ACTGAGTOAO AGCGOCACGC TAGTOTACAA CGAAAGAGAG TTTOTOAACC GCAAGCTGTA CCATATTGCC 
2001 ATGCACGGTC CCGCTAAGAA TACaGAAGAO OAGCAGTACA AGGTTACAAA CGCAGAGCTC GCAGAAACAO AGTaCGTGTT TGACGTGGAC AAGAAGCGAT 
2101 GCOTCAAOAA GGAAGAAGCC TCAGOACTTO TCCTCTCGCO AOAACTOACC AACCCGOOCT ATCACOAACT AGCTCTTOAO GOACrOAAGA CTCGACCCOT 
2201 GGTCCCGTAC AAGGTTGAAA CAATAGGAGT GATAGGOGCA CCAGGATCGG GCAAGTCGGC TATCATCAAC TCAACTUTCA CGGCAOGTGA TCTTCTTACC 
2301 AGCGGAAAGA AAGAAAACTG CCGCtJAAATT CAGGCCGATO TGCTACCCCT CAGGCCCATC CAGATCACGT COAAGACAGT GGATTCGGTT ATCCTCAACO 
2401 GATGCCGCAA AGCCGTAGAA GTGCTGTATO TTGAGQAAGC GTFCGCGTGC CACGCAGGAO CACTACTTCC CTTOATTGCA ATCGTCAGAC CCCGTCATAA 
2301 GGTAGTGCTA TGCGGAGACC CTAAGCAATG CGGATTCTTC AACATGATCC AACTAAAGGT ATATTTCAAC CACCCOGAAA AAGACATATO TACCAAOACA 
3601 TTCTACAAGT TTATCTCCCG ACGTTGCACA CAGCCAGTCA CGCCTATTGT ATCGACACFG CATTACCATO GAAAAATGAA AACCACAAAC CCGTOCAAGA 
1701 AGAACATCGA AATCGACATT acaggggcca cgaagccgaa gcca g gccac atcatcctga CATGCTTCCO CGGCTCGGTT AAGCAACTGC aaatcgacta 

M01 tcccggacat gaggtaatga cagccgcggc ctcacaaggo ctaaccagaa aaggagtata t gc cgt ccg o caaaaagtca atgaaaaccc gctgtacgcg 

2901 ATCACATCAG AGCATGTGAA CGTGCTGCTC ACCCGCACTG AGOACACGCT AGTATGGAAA aCTTT ACAGG GCGACCCATG GATTAACCAC CTCACTAACG 
3001 TACCAAAAGG AAATTTTCAA GCCACCATCG AGGACTGGGA AGCTGAACAC AAGGGAATAA TT GC T G CGAT AAACAGTCCC CCTCCCCGTA CCAATCOOTT 

3101 CAGCTGCAAG actaacgttt gctgggccaa ACGACTGGAA CCGATACTGG CCACGCCCGG tatcgtactt accggttgcc agtcgagcga gctcttocca 

3201 CAGTTTGCAG ATOACAAACC ACACTCGCCC ATCTACGCCC TGGACGTAAT CTGCATTAAG TTTTTCCGCA TGCACTTGAC AACCGGACTG TTTTCCAAAC 
3301 AGACCATCCC GTTAACGTAC CATCCTCCCG ATTCAGCGAG GCCAGTAGCT CATTCGGACA ACAGCCCAGG AACCCGCAAG TATGGGTACG ATCA COC CCT 
3*01 TGCCCCCCAA CTCTCCCGTA GATTTCCGDT GTTCCAGCTA GCTGGGAAAG GCACACAGCT TGATTTCCAG ACGGGCAOAA CTAGAGTTAT CTCCCCACAO 
3501 CATAACTTGO TCCCAGTPAA CCGCAATCTC C CGC A CG CCT TAGTCCCOOA GCACAAGGAG AAA C AACCCO GCCCGGTCAA AAAATTCTTG AGCCAGTTCA 
3601 AACACCACTC CGTACTTGTG GTCTCAGAGG AAAAAATTOA AGCTCCCCAC AAGAGAATCG AATGGATCGC CCCGATTGGC ATAGCC G C CO CTGATAAOAA 
3701 CTACAACCTQ GCTTTCGGGT TTCCGCCGCA GGCACGGTAC GACCTGGTC1 TTATCAATAT TGGAACTAAA TACAGAAACC ATCACTTTCA GCAGTGCGAA 



301 ATTOCGTTTO CCCCATGCGT AGTCCAGAAG A CC C GOACCO CATOATGAAA TATQCCAGCA AACTGGCGGA AAAAGCATGC AAGATTACGA ATAAOAACTT 







3101 OACCATCCCO COACCTTCAA AACCCTCTCG LU I ICOOCCC TCAA CIUX I T AA CCCCC OA G OT A LU. I IP TGOTCAACTC CTACGGTTAC '"■^fTO HY 
3901 ATAGTQAGGA CGTAOTCACC CCTCTTCCCA GAAAATTTGT CACACTCTCT CCACCCACCC CAGACTCCGT CTCAACCAAT ACAOAAATOT ACCTQATCTT 
4001 COCACAACTA GACAACACCC GCaCACGACA ATTCACCCOC CATCATCTGA ATTGTGTOAT HU.1U.U I U TACCAGOGTA CAAGAflACGG AOTTGGACCC 
4101 CCACCCTCAT ACCCCACTAA AA C GCAGAAC attcctoatt gtcaacacoa accacttctc aatccaccca atcccctgog cagaccaggc OAAOOAOTCT 
4201 CCCCTCCCAT CTATAAACCT TOCCCOAACA CTTTCACCCA TTCACOCACA CAOACCCCCA CCCCAAAACT CACTOTOTGC CAACCAAAGA AACTOATCCA 
4301 CCCCCTTCO: CCTCATTTCC CGAAACACCC ACACGCAGAA GCCC1U AAAT TCCTCCAAAA CCCCTACCAT CCAGTGGCAG ACTTACTAAA TCAACATAAT 
UQ\ ATCAACTCT G TCGCCATCCC ACTCCTATCT ACACCCATTT ACCCAGCCGG AAAAGAOCCC CTTCAAGTAT CACTTaaCTO CTTOACAACC GCGCTAOATA 
4301 CAACTOATCC CCAOCTAACC ATCTACTCCC TCCATAAOAA OTGCAACCAA ACAATOOACO C CC T CC TC C A ACTTAAGGAG TCT C T A ATAO AGCTaAAOGA 
4601 TGACGATATC GAOATCGACO ACQAGTTAGT ATGGATOCAT COGOACAOTT GCCTGAAGGG AAGAAAGGQA TTCAGTACTA CAAAAGGAAA GTTGTATTCO 
4701 TACTTTGAAG GCAOCAAATT CCATCAAGCA GCAAAAOATA TGGCCGACAT AAAGGTCCTG TT CCCA AATO AGCAGGAAAG CAAOGAGCAA CTGTOTGCCT 
4<01 ACATATTGGG GGAGAOCATG GAAGCAATCC GCGAAAAATQ CCCGGTCCAC CACAACCCGT CG T CIAGCCC GCCAAAAACO CTG CCO T CCC TCT GCA TOTA 
4901 TGCCATGACG CCAGAAAGGG TOCACAGACT CAGAAGCAAC AACGTCAAAG AAGTTACAGT ATGCTCCTCC AUCMl l I I CAAAGTACAA AATCAAOAAC 
9001 GTTCAGAAGG TTCAGTCCAC AAAAGTAGTC UUIIIAACC CGCATAOCOC TCCATTCGTT CCCCCCCGTA AGTACATAOA AGCOCCAOAA CACCCTCCAO 
3101 CT CCGC CTGC ACAGGCCGAG CAGGCCCCCG AAGTTGCAGC AACACCAACA CCACCTGCAG CTGATAACAC CTCGCTTOAT GTCACCCACA TCTCACTGOA 
3301 CATGGAAGAC AGTAGCGAAG GCTCACTCTT TTCGAGCTTT AGOGGATCGG ACAACTCTAT TACTAGTATG GACAGTTGGT CGTCAGGACC TAGTTCACTA 
3301 GAGATAGTAG ACCGAAGGCA GGTGGTCGTC GCTOACGTCC ATG CCO T CCA AGACCCTGCC CCTGTTCCAC CCCCAAGGCT AAAOAAGATG GC CCG OCT GO 
3401 CAGCGGCAAG AATGCAGGAA GAGCCAACTC CACCGGCAAG CACCAGCTCT GCGGACGAGT CCCTTCACCT TTCTTTTGGT GGGGTATCCA TOTCCTTCGG 
3501 ATCCCTTTTC GACGGAGAGA TGGGCGCCTT GCCAGCGGCA CAA CCC C CG G CAAGTACATG CCCTACGGAT CTGCCTATGT CTTTCGGATC GTTTTCCGAC 
3601 GGAGAGATTG AGGAGCTGAG GCGCAGAGTA ACCOACT C T G AG CCCG T C C t GTTTGGGTCA TTTGAACCGG GOGAAGTGAA CTCAATTATA TCGTCCCGAT 
3701 CAGTTGTATC TTTTCCACCA CGCAAGCAQA GACGTAGACG CAGGAGCAGG AGGACCGAAT ACTGACTAAC CGGGGTAGGT GGGTACATAT TTTOQACGGA 
SKI CACAGGCCCT GGGCACTTGC AAATGGAGTC CGTTCTGCAG AATCACCTTA CAGAACCGAC CTTGGAGCGC AATGTTCTGG AAAGAATCTA C CCCC C GOTO 
3901 CTCGACACGT CGAAAGAGGA ACAGCTCAAA CTCAGGTACC AGATOATGCC CACCGAAGCC AACAAAAGCA GGTACCAGTC TAGAAAAGTA GAAAATCAGA 
6001 AAGCCATAAC CACTGAGCGA CTCCTTTCAG GGCTACGACT CTATAACTCT GCCACACATC AGCCAGAATG CTATAAGATC ACCTACCCGA AACCATCGTA 
6101 TTCCAGCAGT GTACCGGCGA ACTACTCTGA CCCAAAGTTT GCTGTAGCTG TTTGCAACAA CTATCTGCAT GAGAATTACC CGACGOTAGC ATCTTATCAO 
6201 ATCACCGACG AGTACGAT6C TTACTTCOAT ATGGTAGACC GGACAGTCGC TT GCC T A GAT ACTCCAACTT TTTGCCCCGC CAAGCTTAGA AGTTAOOOGA 
6301 AAAGACACGA GTATAGAGCC CCAAACACTC GCAGTCCGGT TCCATCAGCG ATGCAGAACA CGTTGCAAAA CGTGCTCATT GCCGCOACTA AAAGAAACTQ 
6401 CAACGTCACA CAAATGCGTO AATTGCCAAC ACTGGACTCA GCGACATTCA ACGTTGAATG CTTTCGAAAA TATGCATGTA ATGACGAGTA TTGGGAGOAO 
«01 TTTGCCCGAA AGCCAATTAG GATCACTACT GAGTTCGTTA CCGCATACGT GGCCAGACTG AAAGCCCCTA AGGCCGCCGC ACTOTTCGCA AAGACGCATA 
6601 ATTTGGTCCC ATTGCAAGAA GTGCCTATGG ATAGGTTCCT CATGGACATG AAAAGAGACG TGAAAGTTAC ACCTGGCACG AAACACACAO AAGAAAGACC 
6701 CAAAGTACAA GTCCTACAAG CCGCAGAACC CCTGGCCACC GCTTACCTOT CCGGGATCCA CCGGGAOTTA GTGCGCAGGC TTACAGCCGT CTTGCTACCC 
6»1 AACATTCACA CGCTTTTTGA CATGTCGGCG CAGGACTTTG ATGCAATCAT AGCAGAACAC TTCAAGCAAG GTGACCCGCT ACTGGAGACG GATATCGCCT 
6901 CGTTCGACAA AAGCCAAGAC GACGCTATGG CGTTAACTGG CCTOATGATC TTCGAAGACC TGGCTGTGOA ccaaccacta ctcgacttga tcgagtgccc 

7001 ctttcgagaa atatcatcca cccatctgcc cacgcgtacc cctttcaaat tcggggcgat gatgaaatcc cgaatgttcc tcaccctcit tgtcaacaca 
7101 CTTCTGAATG tcgttatcgc cagcagagta ttggaggacc ggcttaaaac gtccaaatgt gcagcattta tcggcgacga caacatcata cacggagtag 
7201 tatctcacaa agaaatgcct gagaggtgtg ccacctggct CAACATGCAG GTTAAGATCA TTGACGCAGT catcgccgao agaccgcctt acttctgcgg 
7301 TGOA7TCATC TTGCAAGATT cggttacctc cacaccgtct c gc gt gg cgo accccttgaa aaggctgttt aagttgggta aaccgctccc acccgacgac 
7401 GAGCAAGACG AAGACAGAAG ACCCCCTCrG ctacatgaaa caaaggcgtg gtttagagta ggtataacao acaccttagc agtggccgto gcaactcggt 

7301 ATGAGGTAGA CAACATCACA C CTgT C CT GC TGGCATTGAG AACTTTTGCC CAGAGCAAAA GAGCATTTCA AGCCATCACA GGGGAAATAA AGCATCTCTA 
7601 CGGTGGTCCT AAATAGTCAG CATAGCACAT TTCATCTGAC TAATACCACA ACACCACCAC CATGAATAGA GGATTCTTTA ACATG C T CG O CCGCOGCCCC 
7701 TTCCCGGCCC CCACTGCCAT GTGGAGGCCG CGGaGAAOOA GGCAGOCGGC CCCGATGCCT GCCCCCAATG G U.1UGU It CCAAATCCAG CAACTOACCA 
7801 CACCCGTCAG TGCCCTAGTC ATTGGACAGG CAACTAGACC TCAAACCCCA CGCCCACGCC CGCCGCCGCG CCAGAAGAAG CAGCCGCCAA AGCAACCACC 
7901 GAAGCCGAAG AAA C CAAAAA CACACGAGAA GAAOAAOAAfi CAACCTGCAA AACCCAAACC CG GAAACAOA CAACCTATCO CACTgXAflTT
B01 AGACTCTTCG ACCTCAAAAA TCACQA C CGA GATGTCATCG GGCACGCACT CGCCATCCAA CCAAACCTAA TCAAACCACT CCACgTOAAA CGAACTATTO 
1101 ACCACCCTCT CCTATCAAAC CTCAAATTCA CCAAGTCGTC AOCATACQAC ATCGAGTTCG CACAGTTGCC GGTCAACATO AGAAGTOAGG COTTCaCCTA 
OOt CAOCACCGAA CACCCTOAAG CCTTTTACAA CTCCCAOCAC GGAGCGGTGC AGTATAOTGO ACCTaOATTT ACC A TC C C CC CCCCAOTACO AOOCAOAGQA 
SX1 GACAGTGGTC OTOCGATTAT GGATAACTCA 6C0C0GOTT0 TCCCCATACT CCTCGGAGGO GCTOATOAGO OAACAAGAAC TOOOCTTTOO OTCOTCACCT 
KOI GQAATaCCaA AGGGaAGACA ATCAAGACAA CCCCCGaAGG QaCaGAAOAQ TGGTCTGCAG CACCaCTCGT CaCCCOCATQ TCCTTGCtTQ GAAACGTGAO 
130! CTTCOCATCC AATCGOOCGC CCACATCCTA CACCOGOGAA OCATCCACAC CTCTTGACAT CCTTGAAOAG AACCTOAAOC ACCACCCCTA COACACCCTO 
UDl CTCAACCCCA TATTGCGGTG CCCATCCTCC CCCACAACCA AAAGAACCCT CACTCACCAC TTTAOCTTCA CCACCOCCTA CTTCCCCACA TGCTCGTACT 
nOl CTCACCATAC TCAACCCTCC TTTACCCCCA TTAAOATOOA CCACQTCTCO GATGAAGCGG ACCACAACAC CATAOGCATA CAOACTTCCO CTCA OinUJ 
UOI ATACGACCAA AGCGGAGCAG CAAGCTCAAA TAAGTACCGC TACATGTCGC TCGAGCAGGA TCATACCCTC AAAGAAGGCA CTATGGATGA CATCAAGATC 

koi agcacctcag gaccgtgtag aaggcttagc tacaaacoat actttctcct cgcqaagtct cctccagggg acagcgtaac gottagtata GCGAGTAGCA 

9001 ACTCAGCAAC GTCATGCACA ATGGCCCGCA AGATAAAACC AAAATTCGTG GGACGGGAAA AATATGACCT ACCTCCCGTT CACGGTAAOA AGATTCCTTC 
9101 CACAGTGTAC CACCGTCTCA AAGAAACAAC CCOCCCCTAC ATCACTATGC ACAGGCCGGG ACCGCACCCC TATACCTCCT ATCTGGAGGA ATCATCAGGG 
m\ AAAGTCTACG cgaagccacc ATCCGGAAAG AACATTACGT ACCAGTGCAA GTGCGCCOAT ta c aa g a c co GTACCOTTAC gacccgtacc GAAATCACOG 
9301 GCTGCACCGC CATCAAGCAG TGCGTCGCCT ATAAGAGCGA CCAAACGAAG TGGGTCTTCA ATTCGCCGGA CTTGATCAOA CATGCCGACC ACACGGCCCA 
9401 AGGOAAATTG catttacctt TCAAGCTGAT CCCCAOTACC TGC A TG C T CC CTGTTOCCCA CGCGCCGAAC otaotacacg gctitaaaca CATCAGCCTC 

9501 caattagaca cagaccacct oacattgctc accaccagga gactaggggc aaatccggaa ccaactactg aatggatcat cggaaaoacg gttagaaact 

9«0I TCACCGTCCA ccgagatggc ctggaataca tatgggccaa tcacgaaccg gtaagggtct atccccaaga otctgcacca ggagaccctc acggatogcc 

9701 acacgaaata gtacagcatt actaccatcg ccatcctctc tacaccatct tagccgtcgc atcagctgct gtggcoatca tgattoccgt AA C1UUOC A 

9S0I gcattatgtc cctgtaaagc gcgccgtgag tccctcacgc catatgccct ggccccaaat g cc gt ga ttc caacttcgct cgcacttttg tgctgtotta 

9901 cgtcggctaa tcctgaaaca ttcaccgaga ccatgagtta cctatggtog aacagccagc cattcttcto cgtccagctg tgtatacccc tcccccctot 

10X1 CATCCTTCTA ATGCGCTGTT GCTCATCCTO CCTGCCTTTT TTAGTGGTTO CCGGC GC CTA CC T GO CCAAO GTAGACGCCT ACGAACATGC GAOCACTGTT 
10101 CCAAATGTGC CACAGATACC GTATAAGGCA CTTGTTGAAA GGCCAGGGTA CCCCCCGCTC AATTTGGAGA TTACTGTCAT GTCCTCGGAG GUIIOU-I T 
10201 CCACCAACCA AOAOTACATC ACCTGCAAAT TCACCACTCT OU1U-LL1LC CCTAAAGTCA AATGCTGCOO CTCCrTOOAA TGTCAGCCCO CCGCTCACCC 
10301 AGACTATACC TGCAAGOTCT TTGGAGGGGT GTACCCCTTC ATOTGGGGAG GAGCACAATG TTTTTGCGAC AGTGAGAACA GCCAGATGAG TGAGGCGTAC 
10401 CTCGAATTGT CAGCAGATTO CGCGACrCAC CACGCGCAGG CGATTAACGT CCATACTGCC GCGATGAAAG TACCACTACC TATAGTCTAC CGGAACACTA 
10501 CCACTTTCCT AGATOTGTAC GTGAACGGAG tcacaccagg aacgtctaaa gacctgaaao TCATAGCTCG ACCAATTTCA CC A 1 LOU I a caccattcga 
10601 TCACAAGGTC CTT ATCCATC CCCGCCTCGT GTACAACTAT GACTTCOCGG AATACGGAGC CATGAAACCA CGAGCCTTTG GAGACATTCA AGCTACCTCC 
10701 TTGACTAGCA AAGATCTCAT CGCCAGCACA GACATTAGAC TACTCAAGCC TTCCGCCAAO AACCTGCATO TCCCOTACAC GCAGCCCGCA TCTGGATTCO 
10101 AGATCTCGAA AAACAACTCA GGCCGCCCAC TCCACGAAAC CGCCCCnTC GGGTCCAAGA TTGCACTCAA TCCGCTTCGA GCGCTGGACT GCTCATACCG 
10901 CAACATTCCC ATCTCTATCG ACATCCCGAA CGCTGCCTTT ATCAGGACAT CAGATGCACC ACT CG TCT C A ACAOTCAAAT GTGATGTCAO TGAGTGCACT 
U001 TACTCACCGG ACTTCGGCGG GATCGCTACC CTGCAGTATG TATCCGACCG CGAAGGACAA TGCCCTOTAC ATTCGCATTC GAGCACAGCA ACCCTCCAAG 
1 1 101 ACTCQACAGT TCATGTCCTO GAGAAAGGAG CGGTQACAGT AGACTTCAGC ACCCCCAGCC CACAGGCGAA CTTTATTCTA tC GC 'IGTGIU GTAAGAAOAC 

luoi aacatccaat gcagaatgca aaccaccagc tgaccatatc gtqagcaccc cgcacaaaaa tcaccaagaa ttccaagcco ccatctcaaa aacttcatcg 
1 1301 agttccctgt ttgccctttt cggcggcccc tcgtcgctat taattatagg acttatgatt tttgcttgca gcatgatgct gactagcaca cgaagatgac 

1 1401 CGCTACCCCC CAATCACCCG ACCAGCAAAA CTCGATGTAC TTCCGAGGAA CTGATGTGCA TAATGCATCA GGCTGGTATA IT AGA rC CCC CCTTACCCCG 
11501 CGCAATATAG CAACACCAAA ACTCGACGTA TTTCCGAGGA AGCGCAGTGC ATAATGCTGC GCAGTGTTGC CAAATAATCA CTATATTAAC CATTTATTTA 
1 l«I CCGGACGCCA AAACTCAATG TATTTCTGAG GAAGCATGGT CCATAATGCC ATGCAGCGTC TGCATAACTT TTTATTATTT CTTTTATTAA TCAACAAAAT 

1»701 tttgttttta acattth 
Girdwood S.A.



Amino Acid Sequence of the Nonstructural Polyprotein

1 MEKPWNVDV DKJSPPWQt OXSPPOPCW AQQVTTNDHA HAKAPSKLAS KUEL8VCT ATTLDtGSAP AftRMFSEKQY HCVCPU1SPE OPDUOdCYAS 

lOi KLAEKAOCJT NXNLHEUCD LBTVLDTTDA ETTSLCPHND VTOCnASYS VMQOVYIHAP GTTYXQAMXO VKTLYWIGFD TTQPMFSAMA OST7ATKTNW 

301 AOEXVUAftN K5LCTTXUE CSTCKLSMI KXELKPGSRV YPSVGSTLYP EHXASLQSWH UCTFKUCCK QSYTOtCOTV VKEGYWKX IIUfUUUE 

301 TVCYAVTNKS EGFLLOCVTD TVK GPVgr VCTYTPATIC DQKTCIMATD 8PD0AQKU. VGLKQUVM CCTNINTHTM QNYLLFOAQ Gf3XWAJ3ttJt 

401 EDLDWEKMIO TKEUELTYGC LWAFITOCVH 57YRPTUTQT IVKVPASFSA FPMSSVWTTS LPMSUtQXK LALCfCCEEC LLQVPEELVM EAXAAP8DAQ 

SOI EFHAFn gff AUtLVAOKG lEAAAEWCB VEGLQADtGA ALVETPROHV UFQANDftM CQYTYVSPT SVUCNAKLAP AHPLADQVK1 mSQUOKY 

ttl AVEPYDAKVL MPAOSAVPWp EFLAUESAT LVTNEUFVN ULTWAMKO PAKNTSEEQT KVTKAELAEY EYVFDVSKXJt CVKKEIASGL VLSOSLYMPP 

701 YHELALECLK TEPWPYKVE TTGVtGAPGS GK&AOCSTV TA&DLVTSCK KENCRSQAD VUtUUlMOn* STTVDSVMLN GOUCAVSVLY YDEAPACHAG 

101 AUAUAIVR PIKKWLCGD PXQCGFFNMM , QLXVYFNHPE KXtfCnCTPYK nflUtCTKJPV TAIVJTLHYD CKMKTTNPCX KMEIDITGA TKPKPOD1IL 

901 TCPWWVKQL QIDYPGHBVM TAAAiQGLTR KGVYAVRQKV NENPLYAR1 EHVHVLLTTT EOM.VWKTLQ ODPWDCQLTN VPKGNFQAYI EOWBAEMCG1 

1001 tAAMSPAPX TNPreOCTHV CWAXJILEPIL ATACTVLTGC QWTELFTQfA DDXPHSAlYA LDVICKFFO HOLTSCLFSX QStPVTYHPA DSAUVAHWD 

1101 NSPOTRXYGY DKAVAAfiLSA RFFVFQLACK GTQLDLQTGR TftVBAQHNL VPVWWLPHA LVPEHKEKQP CPVKKFLSQF KKHSVLWSB ECEAPHKU 

1301 EW1AP1G1AO ADKKYNLAFG PPPQARYDLV FTNUTTKYRH HKFQQCBDHA ATLXTLSUA LNCLNPGOTL WKSYOYAOR NSEDWTALA UCFVRVSAAJt 

1301 PECVSSNTEM YUFRQLDNS RTROFTTHHL NCVBSVYEG T&OGVOAAPS YXTKftEMAD CQEEAWKAA NPLCWQEOV CRAIYKKWPN SPTDSAYSTG 

1401 TAKLTVCQGK KVtHAVCPOP AXHPEAEAIX LLQNAYKAVA OLVNEHNKS VAIPLLSTGI YAAGXDRLEV SLNCLTTALD ETDA0VTTYC LDKKWKUUD 

U01 AVLQLXeSVl EUCDEDMEID DELVWTKPOS CLXCWCCFTT TKCXLYSYTB CTXFHQAAXD MAEDCVLFFN OQESNEQLCA YHjGETMEA! BEXCPVDKNP 

1601 SSSPPKTLFC LCMYAMTPEK VHRLKSKNVK EVTVLU1PL PKYXIKNVQK VQCTTWUN PHTPAFVPAB KYffiAPEQPA APPAQAESAP EVAATTTPPA 

1701 AOMTSLOVTD BL0MEDSSB CSLFSSFSCS ONSTTSUDSW SSGP5SLE3V DKRQVWAOV HAVQEPAPVP PPUXXMAftL AAARMQEEPT PPASTSXAOB 

1101 5UUFGGVS MSFGSLFDCE MCALAAAQPP ASTOTWPM 1FGSFSUCU EBURJIVTES EPVUGSPEP OTVNEBJ1 SWSPPPUCQ UUUUUUTB 

1501 Y 



Amino Acid Sequence of the Structural Polyprotein

1 MNKOFFNMIjG WtPFPAPTAM WRPRXULQAA PMPAIWGLA3 QTQQLTTAVJ ALYtCQATRP QTntPRPPPft QKXQAPKQPP KPKXPKTQEK UCKQPAXPXP 

101 GKftQtMAUCL EAOKLFDVKN EOCOVtCKAL AMEGXVMKPl HVXOnDHPV UKLKFTXSS AYDMBPAQLP VNMUEAPTY T MH P EO PYH WHHOAVQYSO 

301 CRPTTPROVC GRGDSGftPIM DNSGftWAIV LGGADEGTXT AtSWTWNSX Oil 1M I P EG TESWSAAPLV YAMCUCNVS PPOOPPTCY THEPSKALDf 

301 LEENVNHSAY OTUKAIULC CSSCXSXUV TODPTLTSPY LCTCSYCHHT EPCPSPOUS QVWOEAMNT DtigTSAQFO YSQSQAASSN KYKYMStEQD 

401 KTVKEGTMOD nCOTSGPOl KLSYKGYFLL AKCPPOOSVT VSlASSNSAY SCTMAftXOCP KFVCREXYDL PPVHOKK1PC TVYDUZETT AGYTTMHIPO 

301 PHAYTSYLES 33GKVYAKPP SCKNTTYECX CCDYKTCTVT TTTTEITCCTA DCQCVAYKSD QnCWVPfSPD UKHAOHTAQ GKUfLPPKU P5TCMVPVAH 

601 APNWHGFKH tSLQLOTDHL TtLTTOLCA NPEPTYEWO GKTVKNFTVD EDCLBYIWCN HEPVtVYAQI SAPCOPKGWP HEIVQHYYIB HPVYTTLAVA 

701 SAAVAMM1CV TVAALCAOCA RRECLTPYAL APKAVIPTSL ALLCCVKSAN A8TFTBYMSY LWSMQPPPW VQLQPLAAV IVUIKOCSCC UPFLVVAGAY 

Ml LAKVDAYEHA YTVPHVPOIP YKALVEXACY APLNLCnVH SSSVLP5TNQ EK HL1PU V VPSPKVKCCG SLECQPAAHA DYTOCYFCGV YPPMWOQAQC 

W1 FCSSSKSQMS EAYVEUAOC ATOHAQAKV KTAAUXVCUt IVYGNTTSFL DVYVNOYTTG T3XOLXV1AC POASPTPFD HXWIKKCLV YNYDPPEYCA 

1001 MXPGAFCDIQ ATSLTSCOU ASTDISLLXP SAKNVHVPYT QAASGFEMWK NNSCRPLQET APFCOUAVN PLRAVDCSYO NlPStOfPN AAFWT3DAP 

1101 LVSTVKCDVS ECTYSAOFCO MATLQYVSDR ECQCPVKSHS STATLQESTV HVLEKGAVTV HFSTASPQAN FTVSLCOOCT TOUEOCPPA OHIVTTTKKN 

1301 OQEFQAAISK TSWSWLPALf CCASSUJ1Q LMIPACSUML TSTWt 
Nucleotide Sequence of S55

I ATTGCCCOCO TAGTaCACAC TATTGAaTCA AACACCCGAC CAATTOCACT ACCATCACAA TGCAOAAOCC AGTACTTAAC ctagacctag accctcacao tu, iiu „, GTGCAACTGC 
121 AAAAOAOCTT CCCCCAATTT CACCTACTAC CACACCACCT CACTCCAAaT GACCATCCTA ATGCCaCAGC ATTTTCCCaT CTGGCCAGTA AACTGATCCA CCTGCAGGTT CCTACCACAC 

Ml CCACCATTTT ccacataccc agcgcaccgg ctcctacaat cttttccgag caccactacc attccctttc ccccatccct aotccagaao acccccaccc Catcatoaaa tatoccacca 

Ml AACTGGCCGA AAAAGCATCT AAGATTACAA A C AACAACTT CCaTGAGAAG ATCAACCA C C TCCGGACCCT ACTTUATACA CCGGATGCTG AAACGCCATC ACTCTOCTTC CACAACGATO 
«i TTACCTCCAA CACGCGTGCC GAGTACTCCG TCATGCAGGA CGTGTACATC AACGCTCCCG GAACTaTTTA CCACCACCCT ATCAAACCCO TCCCCACCCT gtactocatt oocttcgaca 
Ml CCACCCACTT CATCTTCTCO CCTATOCCAO CTTCCTACCC TBCATACAAC ACCAACTOCC CCCACCAAAA ACTCCTTCAA CCCCCTAACA TCCCACTUTC r *^A^AAA iO CTCAGTCAAC 
711 CC AC CACACO AAACTTCTCG ATAATCACGA ACAAGCAGTT n / UCC C CGGO TCACCCCTTT ATTTCTCCCT tccatccaca ctttacccac aacacacacc caccttccao agctggcatc 
Ml TTCCATCCCT CTTCCACTTC AAA O GAA A C C ACTCGTACaC TTCCCCCTCT CATACACTCC TCAU.1U.CA ACCCTACCTA gtCAAOAAAA TCACCATCAC TCCCCCOATC ACGCCACAAA 
961 CCCTCCCATA CCCCCTTACA AACAATAGCG AGGC LULll CCTATCCAAA CTTACCCATA CAGTAAAAGG ACAACCCCTA lll-l ICCLCU TCTCCACCTA TATCCCOCCC ACCATATCCC 
101 1 ATCAGATGAC CGGCaTAATC CCCA C CCATA TCTCACCTCA CCATCCACAA AAACTTCTCG TTGCOCTCAA CCACCCAATC CTCATTAACC GTAAGACTAA rAflgAACACT AATACCATOC 
1201 AAAATTACCT TCTCCCAATC ATTGCACAAC GCTTCAGCAA ATGOGCCAAC CAC CC CAAAO AAGATCTTCA CAATCAAAAA atcctcocca ccacagaccg CAAOCTTACA TATOOCTCCT 
ini TOTGCGCGTT TCGCACTAAC AAAGTCCaCT ccttctatcg cccacctcca A C CCAGA C CA TCCTAAAACT CCCACCCTCTTTTAGCGCTT tccccatctc atccotatoo actacctctt 
1441 TCCCCATCTC CCTCACOC A O AAGATCAAAT TGGCATTACA A CC AAAGAAO GaGGAAAAAC TGCTGCAAGT CCCGGAGGAA TTACTTATGO *g?rcVifflr IM-UlUi ftC CATOCTCACC 
1361 AGGAATCCAO AOCCCACAAO CTCCCaGAAG CACTCCCACC ATTAGTGCCA GaCAAAGCTA TCCACCCAGC TGCGGAAGTT GTCTGCCAAG TCOAGCCCCr CCAGCCGGAC accgcagcac 
1611 CACTCCTCCA AACCCCGCCC CCTCATCTAA GCATAATACC TCAAGCAAAT GaCCOTATCA TCGGACACTA TaTCCTTCTC TCCCCCATCT CTCTCCTCAA GAACCCTAAA I HAl A CC M ) 
1601 CACACCCCCT AGCAQA CC A O GTTAACaTCA TAACGCACTC CG GAACATCA GCAAGGTATG CACTCGAACC ATACCACGCT AAAGTACTGA TCCCAGCAGO AACTOCCGTA ccatggccag 
tm AATTCTTAGC ACTGAGTCAG ACCCCCACCC TTCTCTACAA CGAAAGaCAG TTTCTCAACC GCAAGCTCTA CCATATTCCC ATGCACGGTC CCGCTAAGAA TACAGAACAO OAGCAGTACA 

imi acgttacaaa gccagag c tc ccacaaacao actacctctt TCACOTCCAC aagaagcgat gcgttaacaa ccaagaagcc tcacgacttg tcctttcccc agaactgacc ftflrcfrrc c i 
31*1 atcacgaact acctcttcag gcactcaaca ctccaccccc cctcccctac aaggttgaaa caatagcact catacgcaca ccaccatcco gcaagtcagc tatcatcaag tcaactctca 
fill CCGCACCTCA TCTTCTTACC AC C GGAAAOA aaCAAAACTC CCCCCAAATT GAGGCCCACG TCCTACCCCr cacgggcato cacatcacct ccaacacagt ggattcogtt atcctcaacg 
J«l GATOCCACAA AGCCGTAGAA CTCCTCTATC TTCA C GAA CC CTTCCCGTGC C A C CCaGCaC CaCTACTTGC CTTCATTCCA ATCGTCACAC CCCGTAAGAA CCTAOTACTA TGCGGAGACC 
inr CTAAGCAATG CCCATTCTTC AACATCATCC AACTAAACCT ACATTTCAAC CACCCTCAAA AACACATATO TACCAAOACA TTCTACAACT TTATCTCCCC ACCTTGCACA CAOCCACTCA 
2641 CGGCTATTOT ATCOACACTO CATTACCATO GAAAAATGAA AACCACAAAC CCCTGCAAGA AGAACATCCA AATCGACATT ACAGCCGCCA CCAAGCCOAA CCCAOCCGAC ATCATCCTGA 
2761 CATCTTTCCG cccctccctt aagcaa c tgc aaatccacta TC CCC C A CAT CACCTAATCA CACCCGCCCC CTCACAAGCG CTAACCAGAA AAGGACTATA TGCCOTCCGO CAAAAACTCA 
2U1 ATGAAAACCC gctgtacccg ATCACATCAC AGCATGTGAA CCTCTTOCTC ACCCGCACTG AGCACACCCT ACTATGOAAA actttacagg OCGACCCATC CATTAAOCAG CTCACTAACC 
3001 TACCTAAAGG AAATTTTCaG OCCACCATCO AGGACTCCGA AGCTGAACAC AAGGGAATAA TTGCTOCGAT AAACAGTCCC CCTCCCCCTA ccaatccott CAOCTCCAAO actaacgttt 
3121 gctgggccaa accactgcaa ccgatactgo ccacccclu* tatcctactt ACCCCTTCCC actccaccca GCTGTTCCCA CACTTTCCCC ATGACAAACC acactcgccc atctacccct 

3241 TACACCTAAT TTGCATTAAC TTTTTCCGCA TCGACTTGaC AAUHiU. It TTTTCCAAAC AGACCATCCC CTTAACCTAC CATCCTCCCO ACTCAOCGAO CCCACTAOCT CATTOOGACA 
3361 ACACCCCACC AACA C OCAAO TATGGCTACG ATCACGCCGT TCCCCCCCAA CTCTCCCCTA CATTTCCOCT GTTCCAOCTA GCTGGOAAAG ccacacaoct tcatttccao actctcacaa 
3*ii CTAGAGTTAT CTCTOCACAG CATAACTTGG tcccactcaa ccccaatctc cctcaccccttactccccca GCACAACCAO AAACAACCCO cccccctcca AAAATTCTTO acccagttca 

3601 AACACCACTC CCTACTTCTC ATCTCACAGA AAAAAATTCA ACCTCCCCAC AACAGAATCO AATCCATCCC CCCGaTTGCC ATAGCCCCCC CaCATAAGAA CTACAACCTG UL1HLUAJ 
TTlt TTCCGCCGCA CCCA CC CTAC CACCXCCTCT tcatcaatat tccaactaaa t acacaaa cc atcactttca acagtgcgaa CA CC AC GCGO CCACCTTCAA AACeeTTTCC CCTTCCCCCC 
3WI TCAACTCCCT TAACCCCCCA GC fi A CLLILU TCCTCAACTC CTXCGCTTAC O CC GA CCC CA ATACTGAGGA CCTACTCACC CCTCTTCCCA GAAAATTTCT CACACTCTCr wrAW^A^f 
3961 CaGACTCCGT CTCAACCAAT ACaGAAATCT ACCTGATTTT CCCACAACTA GaCaaCACCC CCACA C GaCA ATTCACCCCC CATCATTTGA ATTCTCTCAT IIIUICU I II TACCACOCTA 
40,1 CAACACACGG ACTTGCAGCC CCaCCCTCCT ACCCTaCTAA AaCGCAOAAC ATTCCTCATT GTCAAGAGCA AGCACTTCTC AATGCAGCCA ATCCACTCCC CACAFTMViA CAACCACTCT 
<ttl CCCCTCCCAT CTATAAACCT 1UA.CU AACA CTTTCACCCA TTCAGCCaCA GAGACAGCTA CCGCAAAACT CaCTCTCTGC CAACGAAACa AACTCATCCA UIUU1UUJ CCTGATTTCC 
«13l CCAAACA C CC * n *CCCAGAA GCCCTCAAAT TCCTGCAAAA CGCCTACCAT GCAGTGOCAO ACTTACTAAA TCAACATAAT atcaactctc TCGCCATCCC ACTGCTATCT acacccattt 
***' ACGCAGCCGG AAAACACCCC cttcacctat cacttaacto CTTCACAaCC gccctagaca caactcatgc gcacgtaacc atctactccc tgoataagaa gtcgaagcaa AOAATCCACC 
4561 cgctcctcca acttaaggag tctctaactc acctgaagca tcagcatatg gagatccacg accagttact atcgatccat ccccacactt GCCTCAAGGG AAGAAAGCCA TTCAGTACTA 
*6ii ca.aaagcaaa cttgtattcc tactttcaag gcaccaaatt ccat c aagca ccaaaacata tcccccacat aaagctcctg ttcccaaatg accagcaaao caacgaacaa cuiiutcci 
*»i acatattccg ggacaccatc caaccaatcc ccgaaaaatg cccoctccac cacaacccct cctctagccc gccaaaaacg ctcccctccc tctctatcta tgccatgacg ccagaaagcc 
«ni tccacagact cacaagcaat aacctcaaag aacttacact atcctcctcc accccccttc CAAACTACAA AATCAAGAAT CTTCACAACC ttcactccac aaaagtaotc ctctttaacc 
3041 CGCaTaCCCC CCCATTCCTT CCCCCCCCTA aCTACATAGA aCCACCaCAA CACCCTGCAC CTCCCCCTCC ACACCCCCaO GAGGCCCCCC gacttctagc cacaccaaca ccacctccac 
S16I CTGATaaCAC CTCCCTTCAT CTCACCCaCA TCTCACTGCA CATCCAACaC"aCTAGCCAAG GCTCACTCTT TTCCaCCTTT AGCCCaTCGG ACAACTaCCG AAGGCaCOTO U1ULILU.1U 
nti acgtccatcc CCTCCAAGAG CCTCCCCCTC TTCCACCCCC AaGCCTAAAC AACaTOGCCC CCCTCCCaGC GCCAACAATG CaCGAACAGC CAACTCCACC GCCAACCACC ACCTCTCCCG 
S40I accactccct TCaCCTTTCT TTTCATCGCG TATCTATATC CTTCGGATCC CTTTTCCACC CaCaCATGCC CCCCTTCCCA CCCGCACAAC CCCCCCCAAC tacatgccct ACCCATCTGC 
1351 ctatctcttt CCCATCCTTT T CCC A CC CaC acattgacca GTTCAGCCGC AGAGTAACCG AGTCOCAGCC CUCCIbUl occtcatttg AACCCCGCCA ACTCAACTCA ATTATATCCT 
M4I CCCCATCAGC CCTATCTTTT CCACCaCGCA A G CA C ACA C G TAGACGCACC AGTAGnAnCA CCCAATACTC TCTAACCCCC CTACCTCCCT ACATATTTTC CaCCGACACA CCCCCTCGGC 
SHt ACTTCCAAAA CAACTCCCTT CTCCACAaCC AGCTTACACA ACCCACCTTC CAC CC CAATC MUU. AAA C AATCTaCCCC UXULCICL ACACCTCGAA ACAGCAACAO ctcaaactca 
«tl GGTACCACAT CATCCCCACC GAAGCCAACA AAAC C AGGTA CCAGTCTCGA AAACTAGAAA « C CaGAAAO C CATAACCACT CACCGaCTCC TTTCA CCCCT ACCOCTCTAT AACTCTOCCA 
6Q» CACATCAGCC ACAATOCTAT AAGATCACCT ACCCCAAACC ATCCTATTCC AGCACTCTAC CACCCAACTA CTCTCACCCA AACTTTCCTG TAGCTCTTTO TaaCAACTAT CTCCATGACA 
6121 ATTACCCCAC CCTAGCATCT TaTCACATCA CCCACGaGTA CCATCCTTAC TTCCaTATCC TACaCGCGAC aCTCCCTTCC CTaGATACTG CAACTTTTTO CCCCCCCAAC CTTACAAGTT 

6U1 accccaaaac acacoagtat » r ^ ficccc AA ACATCCCCAO tccccttcca tcagccatgc agaacacctt GCAAAACCTC CTCATTCCCC CCACTAAAAO aaactccaac ctcacacaaa 
6361 TCCCTCAACT C CC AACACTC cactcaccca cattcaacct TGAA1U.HI CCAAAATATC catccaatca ccactattcc caccactttc ccccaaag cc aattagcatc ACTACTCAGT 
641 TCCTTACCCC ATACCTOGCC ACACTCAAAO CCCCTAACCC CCCCGCACTG TTCCCAAACA COCATAATTT CCTCCCATTO CAaCAACTCC ctatccatao attcctcatc CACaTCAAAA 

6601 gacacgtcaa acttacacct cccaccaaac acacagaaga aaca ccc aaa ctacaagtca tacaagccgc agaaccccttj gcga ccu. ii acctatgcoo catccaccgo gacttaotgc 
6711 CCAOCCTTAC ACCCCTTTTC CTACCCAACA TTCACACGCT CTTTCaCATB T O ' ^.UL aG G ACTTTCATCC AATCATAGCA CaaCACTTCA ACCAAGCTCA CCCCGTACTC CAGACaOATA
6MI TCGCCTCGTT CGACAAAAGC CAACA C CACO CTATCOCCTT AA f.CUU.LtU ATGATCTTCG AaGaCCTGGG TCTCCaCCAA CCaCTACTCO ACTTCATCGA ULUXLII F GCACaaaTAT 
«ftl CATCCACCCA TCICCCCACC CCTA CC CCTT TCAAATTCCC GCCCaTGATO AAATCCGGAA TCTTCCTCAC LL1H11HL AACaCACTTC TCAATGTCCT TATCGCCAGC AGAGTATTGG 
7MI AOCA CCCOC T TAAAACCTCC AAATCTOCAG CATTTATCCC C CA r fiACAA C ATTATACACC cactactatc tcacaaacaa atggogaca gctctcccac CTCOCTCAAC ATCCACCTTA 
7201 ACATCATTCA CGCaCTCATC GQCGAGAGAC CACCTTACTT CTGCCCTOCA TTCATCTTCC AAGATTCCCT TACCTCCACA LLL1U1LU-L TCOCCCACCC CTTCAAAAOO CTCTTTAACT 
mt TCCCTAAACC OCTCCCACCC CACCATCACC AAGaCCAAGA rACAACA C OC GCTCTOCTAC ATCAAACAaA CCCCTCGTTT AGAGTAGGTA TAACAOACAC CTTACCACTO "■^■-■■AJtMA 
T44| CTCCGTATCA GCTAGACAAC ATCACACCTC TCCTOCTOGC ATTGACAACT TTTQCCCACA CCAAAACACC ATTTCAAGCC ATCACAOSCC AAATAAAOCA TCTCTACGCT GCTCCTAAAT 
TM1 ACTCACCATA CTACATTTCA TCTCACTAAT ACCACAACAC CACCACCATC AATACACCAT TCTTTAACAT CCTCCCCCOC LLU.LL1 ILL «**TC»ie TGCCATGTCO AffOP WTmrt 
7&11 CAACGACCCA CCC CCCC CCC ATOCCTCCCC GCAATCCCCT LLL1 1U.LA A ATCCACCAAC TCAC C ACaC C CCTCACTCCC CTaCTCATTO OACAOCCAAC TACACCTCAA i^f ffT 
T»l CACCCCCCCC CCCCCCCCAC AACAACCaGO CCC CAAACTA AtT ht XC MO C C CAACAAA C C AAAAACACA GGACAACAAO AAOAACCAAC CTGCAAAACC cwrcencft A^T M^ec 
7921 CTATOCCACT TAACTTGGAG CC C CACACa C tcttcgacct CAAAAATOAC GACGCACATG TCATCGGGCA cccactcocc atgoaaggaa aggtaatgaa accactccac GTCAAACCAA 
*9*1 CTATTOA CC A LLLILIU.1A TCAAAC C TCA AATTCACCAA CTCOTCAOCA TACCACATCB ACTTCCCACA LUUXLL1L AACATOAOAA CTOA bLUJll CACCTACACC ACTGAACACC 
11*1 CTCAACSCTT CTACAACTCC CA e CACCCAC CCCTCCAGTA TAGTOOACGC ACATTTACCA TCCCCCOCOG ACTaCOACCC ACaGQaGaCA LILLILULL oattatcoat aactcaoocc 
OSI CCCTTUTCCC CATAUILLIL CCAGUJA, IU ATCaGCCAAC AAGAACCGCC CTTTCOCTCO TCaCCTCCAA TaGCAAACGG AACaCAATCA ACACAACCCC CQAAOOOACA CAAGACTCCT 
Ml CTGCTCCaCC ACTOCTCACC CCCAIULLI TCCTTOCAAA CCTCAGCTTC CCATCCAATC CCCCCCCCAC ATCCTACACC COCOAACCaT CC ACA U-IL1 CC AC A T CC T C nAAflAOAACC 
mi TCAA C CACCA CCCCTACCAC ACCCTCCTCA ACCCCATATT CCQCTOCCCA TCCTCCCCCA GAaGTAAAAG AAGCOTCaCT CACCACTTTA ccttgaccag cclc t a ctto ggcacatgct 
MAI CCTACTT7TCA CCATACTCAA CCCTDCTTTA GCCCOaTTAA CAT C CAOCaO CICIUUMTC AAO C C C A C GA CAaCaCCATA COCATaCACA LULL LULL A CTTTOOATAC ^""'"^ 
n«l CAOCACCAAG CTCAAATAAC T ACCCC T A CA TCTCCCTCCA OCAGCATCAT ACTUTCAAAO AACCCACCAT CCATCACATC AAOATCAGCA CCTCACCACC CTCTAOAACO CTTAOCTACA 
ttll AACCaTACTT TCTCCTCCCC AAGTCTCCTC CACCCCACAO CCTAAUAJII AOCATACCCA GTAGCAACTC AOCAACCTCA TCCACAATOO CCC CC AA C AT iViiMTrtftM TTCCTCOCAC 
9001 CCCAAAAATA TCACCTACCT CCCCTTCACG CTAAQAACA T ILL1 IUL ACA CTCTAOCACC CTCTCAAAGA AACAACCCCC CCCTACATCA CTATGCACAC CCCOGGACCG CACOCCTATA 
t 111 CATCCTATCT CCACCAATCA TCAOCCAAAG TTTA C C C CAA GCCACCATCC OGGAAGAACA TTaCCTACCA CTOCAACTOC OOCOaTTACa ACAOCOCAAC CCTTACOACC CCTACCGAAA 
n<\ TCACGCGCTG CACCCCCATC AACCACTCCO TCCCCTATAA fiAOCBACCAA ACGAACTGGC TCTtCAACTC CCCOCACTCO ATCACACAC O CCOACCACAC TOTTAAffiW AAATTOCATT 
9361 TCCCTTTCAA GCTCATCCCG ACTACCTCCA IUL1LLL1H IGIXCACOC C CCC AA CC TAC TACACCCCTr TAAACACATC AGCCTGCAAT TACACACACA CCATCTGACA IIULILA CCA 
9«irCCACCACACT A CCCCCAAAC ccocaaccaa ccactcaatc oatcatccca AACACOCTTA CAAACTTCAC cctccaccca cat ccc ctco AATACATATC ccccaatcac caaccactaa 
9691 CCCTCTATCC CCAACACTCT C C A C CACCAC ALLL1LACCC ATOCCCACAC CAAATACTAC ACCATTACTA TCATCCCCaT CCTCTGTACa CCATCTTACC CCTCCCATCA CCTOHHUi 
Vni CCATCATCAT TCCCCTAACT CTTCCACCaT TAIU1ULLIL T AAA CCCCGC CCTCACTCCC TCACCCCATA IU.LLILUX CCAAATCCCC TCATTCCAAC TTCCCTCCCA LHHU I U. I 
9U\ CTCTTACCTC CCCTAATCCT CAAACATTCA CCCACACCAT CAGTTACTTA TCCTCGAACA CCCACCCCTT L I IL1ULUC CACCTOTCTA TACCTCT66C LLHUILUIL CnCTAATCC 
9WI UL1LUU.11. ATCCTCCCTC CCTTTTTTAC TCCTTCCCCC CCCLIACCTC CCCAACCTAC ACCCCTACCA ACATCCCACC AL1LI 1LLAA ATCTGCCACA GATACCCTAT AACCCACTTC 
I COI1 TTCAAACCCC ACCCTACCCC CCCCrCAATr TCCACATTAC TCTCATCTCC TCCCACCTTT TCCCTTCCAC CAACCAACAO TACATTACCT CCAAATTCAC CACTCTCCrC CCCTCCCCU 
10301 AACTCACATO CTCCCCCTCC TTCCAATCTC' ACCCCCCCCC TCACCCACaC TATACCTOCA AGCTCTTTCC ACCGOTCTAC LLLI 1LA TCT C CCCACCACC ACAATCTTTT TCCOACACTG 
1Q3H ACAACAC CC A CATCACtCAC CCCTACCTCC AATR7TCACT ACATTSCOCC ACTCACCACG CCCACGCCAT TAACCTCCAT ACTCCCCCCA TCAAACTACO ACTOCCTATA CTCTACCCCA 
10*41 ACACTACCAC TTTCCTACAT CTCTACCTCa ACCCACTCAC ACCACCAACC TC T AAACA C C TCAAACTCaT ACCTOCACCA ATTTCACCAT TCTTrACACC ATTCCATCAC AAOCTCCTTA 
I0M1 TCAATCCCCC CCTOCTCTAC AACTATQACT TPCCCCAATA CCGACCGATD AAACCAGGAO CCTTTCCAGA CATTCAACCT ACCTCCTTCA CTAOCAAACA CCTCATCCCC AGCACACACA 
tOMl TTAGCCTACT CAAC LLULC C CC AACAA C C TSCATCTCCC CTACACCCAC OCCOCATCTC CATTCCACAT CTCOAAAAAC AACTCACCCC CCCCACTCCA COAAACCCCC LLHHULLl 
KM0I CCAACATTCC ACTCAATCCC CTT C CAC C CO TCCACTCCTC ATACCCCAAC ATTCCCATTT CTATTCACAT CCCCAACCCT CCC7TTATCA CCACATCAOA tOCACCACTO OTCTCAACAO 
J 091 1 TCAAATCTOA TCTCACTCAC TCCACTTATT CACCCCACTT CCCACCCATB CCTACCCTCC ACTATCTATC CCACCCCCAA CCACAATCCC ctctacattc ocattcgacc acaccaaccc 
U«1 TCCAACACTC cacacttcat CTCCTCCaCA aacca lccc t cacactacac ttcaccaccc ccaccccaca gccgaacttc attctatccc tutctcctaa CAACA C AA C A tccaatocac 
ti i6i AATCCAAACC accacctcat catatcctca ccacccccca caaaaatcac caacaattcc aacccgccat ctcaaaaact tcatccactt ccctctttoc ccttttcccc canxciixr 

1 1211 CGCTATTAAT TaTaCCaCTT ATCATTTTTC CTTCCACCAT CaTCCTCACT ACCaCa C CaA CATCA CtCLI ACCCCCCAAT CaCCCCaCCA CCAAAA CT CO ATCTACTTCC CACGAACTCA 
1 1*01 TCTCCATAAT CCATCACCCT CCTATATTAC AIUJLLU.IJ A C CCCC CCCA ATATACCaaC ACCAAAACTC CaCCTATTTC CCACCAAOCC CACTCCATAA TCCroCCCAG IL1IU.LAAA 
IUZ1 TAATCACTAT ATTAACCATT tattcacccc ACCCCAAAAC tcaatctatt tctcaccaac catcctccat AATCCCATCC accctctcca taacttttta ttatttcttt TATTAATCAA 

11*41 CAAAATTTTC TTTTTAACAT TTC 
Nucleotide Sequence of TR339

I ATTOGCGGCG TAGTACACAC TATTCAATCA AA^GCCCAC CAATTCCACT ACCATCAeAA TCUCAACCC AGTACTAAAC CTaQACOTAG "*~rmn TCCCIIIliiC OTGCAACTCC 
Ul AAAAAAGCTT CCCCCAATTT CACWTACTAO CACAGCAGGT CACTOCAAAT CACCATCCTA ATOCCAOACC ATTTTCGCAT CTGGCCAGTA AACTAATCOA OCTCOACOTT CCTACCACAG 
241 COACCATCTT CCACATAOCC ACCCCACCCC CTCCTAOAAT CTTTTCCCAG CACCAGTATC A 1IU1UIUU CCCCATOCOT ACTCCACAAG *T*irtQCB CATOATOAAA TATOCCAGTA 

161 AACIOSCC6A AAAACCCTOC AACATTACAA ACAAOAACTT CCATCACAAG attaaggatc tccccaccct acttgataco ccgcatgctc aaacaccatc cctctocttt cacaaccatc 

41 I TTACCTOCAA CATBCOICCC CAATATTCCG TCATOCAGGA CCTOTATATC AACCCtOCCO OAACTATCTA TCATCAOGCT ATOAAAGGCO TCCGGACCCT OTACTCCATT OQCTTOUCA 
*>1 CCACCCACTT CATOTTCTCC GCTATGGCAG GTTCGTACCC TSCCTACAAC ACCAACTGGO CCGACQAGAA ACTCCTTGAA GCGOQTAACA TCCCACTTTC CAOCACAAAO CTOAQTOAAO 
m OTAOCACACO AAAATTUTCO ATAATQAOGA ACAACGACTT OAAG CCCC GO ILULUUilll ATTTCTCCCT ACOATCOACA CTTTATCCAO CAOCITOCAO AOCTOOCATC 

W I TTCCATC OOT CTTCCACTTO AATGGAAAGC ACTCGTACAC n COXClUI OATACACTGQ TBAOTTOCOA ACOCTACOTA GTGAAGAAAA TCACCATCAO TCCCCCCATC a-w^^ 
»l CCCTCGGATA CCCCCTTACA CACAATACCO ACCCCTTCTT CCTATOCAAA CTTACTGACA CACTAAAACO ACAACO00TA T CCriOUU TOTCCACCTA CATCCCCOCC ACCATATOCC 
tMI ATCAGATOAC TCCTATAATO CCCACCCATA TATCACCTQA CCATCCACAA AAACTfCTCC TTOCCCTCAA CCACCGAATT OTCATTAACG CTAOOACTAA AT ICt AACACCATOC 

iwi aaaattacct tctocccatc ataocacaag cottcaccaa atccoctaao oao coc aao o ATOATCTTCA TAACCAQAAA ATOCTOOCTA CTAOACAACO CAAOCTTACO TATOOLIU.T 
ml TCTCOCCUTT TCCCACTAAC AAACTACATT CCTTTTATCO CCCACCTCOA A CCC ACACCA TOTAAAAOT CCCACCCTCT TTTAOCOCTT TTCCCATOTC OTCCOTATCO ACCACCTCTT 
l«l TCCCCATOTC CCTCACCCAC AAATTQAAAC TCCCATTCCA A C CAAACAAO GA O flAAAAAC TCCTGCACCT CTCOGAGGAA TTAOTCATOO ^r. Mff? F TCCTTTTOAO CATOCTCACO 
I* I ACCAACCCAO ACCCCACAAO CTCCQACAAO CACTTCCACC ATTACTCCCA CACAAACOCA TCOACCCACC CGCACAAGTT GTCTUCCAAO TOOAOOCOCT CCACCCCOAC ATCGOAOCAO 
Mil CATTACTTOA AACCCCCCCC CCTCACCTAA CGATAATACC TCAACCAAAT CACCCTATGA TCCOACACTA TATCCTTCTC TCCCCAAACT CTCTCCTOAA OAATOCCAAA CTtOCACCAO 
1101 CCCACCCCCT ACCAGATCAG CTTAACATCA TAACACACTC CCOTACATCA COAACCTACC CCCTCGAACC ATaCCACCCT AAACTACTCA TCCCAOCACG AOCTCCCCTA CCATCCCCAO 
Ml AATTCCTACC ACTGAGTCAG AOCGCCACGT TACTCTACAA CC AAACACAO TTTUICAACC CCAAACTATA CCACATTCCC ATCCATCCCC CCGCCAAGAA TACACAAQAfi CAOCAOTACA 
»«l ACCTTACAAA CCCACACCTT CCACAAACaO AGTACGTCTT TGACGTCGAC AACAACCOTT CCOTTAACAA CGAACAAOCC TCAOCTCTCa lUIULUUi AGAACTOACC AACCCTCCCT 
2t*I ATCATOACCT ACCTCTOOAa CCACTCAACA CCCCACCTCC COTCCCCTAC AAOORXAAA CAATAGOACT CATACCCACA C CCOCC TC OO GCAAOTCAGC TATTATCAAO TCAACTCrCA 
221 1 CCGCACOCOA TCTTCTTACC AOCOOAAAGA AAOAAAATTO TCOCOAAATT CAOOCCSACO TCCTAACACT GAGGOOTATO CAOATTACCT COAAOACACT AG AHLU.I1 ATOCTCAACG 
240! GATGCCACAA ACCCCTACAA CTGCTOTACC TTGACOAAOC UU.U.UU. CACCCACOAC CACTACTTCC CTTGATTCCT ATCCTCACGC CCCGCAAGAA CGTAOTACTA TCCOGAGACC 
212 1 CCATOCAAT0 CGCATTCTTC AACATGATCC AACTAAACOT ACATTTCAAT CACCCTGAAA AAGACATATG CACCAAOACA TTCTACAAGT ATATCTCCCO CCCTTOCACA CAOCCACTTA 
2*1 CAGCTATTGT ATCCACACTO CATTACCATO OAAAGATOAA AACCACOAAC CCCTOCAACA ACAACATTOA AATCOATATT ^m O OJ i -yAfTO AA OCCAGCGOAT ATCATCCTCA 
n« CATOTTTCCC CCGCTGGCTT AACCAATTOC AAATCCACTA TCCCGSACAT GAACTAATCA CAOCCGCOX CTCACAAGGG CTAACCAOAA AAGGACTQTA TO CC CT CCOU CAAAAACTCA 
2111 ATGAAAACCC ACTOTACCCO ATCACATCAC AGCATOTOAA CC IC T I OUL ACCCCCACTO AGGACAOGCT ACTCTOAAA ACCTTGCACC CCCACCCATC GATTAACCAC CTCACTAACA 
»! TACCTAAAGO AAACTTTCAO OCTACTATAO AGGACTOGOA AGCTOAACAC AAGCGAATAA TTCCTOCAAT AAACACCCCC ACTCCCCOTO CCAATCCOTT CAOCTOCAAO ACCAACOTTT 
Ittl CCTGCGCOAA ACCATTCOAA CCGATACTAO CCACGGOCCG TATCOTACTT ACCGOTTCCC ACTCCACCGA ACTCTTCCCA CACTTTCCCG ATCACAAACC ACATPCCOCC ATTTACCCCT 
22*1 TACACCTAAT TTGCATTAAG TTnTCCCCA TCCACTTCAC AACCOOACTG TTTTCTAAAC AOAGCATCCC ACTAACCTAC C A TC CCGC C C ATTCACCCAG ccccctacct CATTGOQACA 
12*4 ArAO CC CACG AACCCGCAAG TATOOCTACO ATCACCCCAT lU'lU.LU AA CTCTCCCCTA GATTTCCCGT CTTCCACCTA GCTGOGAAGG GCACACAACT TGATTTOCAO ACOOOOAGAA 
Wl CCAGACTTAT CTCTCCACAO CATAACCTCC TCCCCCTGAA ccccaatctt OTCACGCCTTACTOCCCA OTACAAGCAG AACCAACCCC CCCCGCTCGA AAAATTCTTO AACCACTTCA 
3MI AACACCACT C AGTACTTCTO CTATCACAOO AAAAAATTCA agctccccct aagagaatcg AATCCATCGC CCCCATTCCC ATAOCCCOTO CAOATAAOAA CTACAACCTS GCTTTCCCCT 
2H! TTCGGCCOCA GGCACCGTAC GACCTGGTGT TCATCAACAT TCGAACTAAA TACAGAAACC ACCACTTTCA CCAGTGCGAA CACCATCCOC CGACCTTAAA AACCCTTTCO ccttcgoccc 
* 1 TCAA TTGCCT TAACCCACCA GCCACCCTCC TCGTOAACTC CTATOBCTAC GCCCACCGCA ACACTCACCA COTACTCACC CCTCTTGCCA CAAACTTTCT CAOCOTOTCC "~flTBMMC 

mi cagattgtct ctcaaccaat acagaaatct acctbatttt cccacaacta Oat aacac cc ctacacgoca attcacccco caccatctoa attocotoat n c tn cc o w TATCAOOOTA 

*01l CAaOAGATGO AGTTOQAOCC OCGCCCTCAT A CCCC A CC AA AACGOACAA T ATTOCTGACT CTCAACAOGA ACCACTTGTC M*rr^CTT!A ATOCCCTGGO tagaccagcc gaaocactct 
*20 1 CCCCTGCCAT CTATAAACCT TCCCCCACCA CTTTTACCCA TTCAOCCACC flAfiA C AOGCA CCGCAACAAT (UCTCTCTCC CTACGAAACA AACTOATCCA CCCGCTCOGC CCTOATTTCC 
«2I CGAAGCACCC AGAAGCAGAA CCCTTCAAAT TCCTACAAAA CCCCTACCAT GCAGTOOCAG ACTTAGTAAA TGAACATAAC ATCAAGTCTG TCCCCATTCC ACTOCTATCT ACAOCCATTT 

ACGCAOCCCG AAAAGACCCC CTTGAAGTAT CACTTAACTC CTTGACAACC GCGCTAGACA CAACTOaCGC CCACCTAACC ATCTATTCCC TCGATAAOAA CTCCAACCAA AGAATCCACC 
* MJ CCCCACTCCA ACTTAACCAO TCTGTAACAG AGCTCAACCA TCAACATATC GAOaTCCACG ATGACTTACT ATCGATCCAT CCAGaCACTT CCTTGAACGC A A<TM><?PPA TTCACTACTA 
««l CAA AAGCAAA ATTCTATTCC TACTTCCAAO CCACCAAATT CCATCAAGCA CCAAAACACA TGGCCGAGAT AAAGCTCCTO TTCCCTAATG ACCAGCAAAG TAATCAACAA ctctctccct 
4101 ACATATTCCO TCACaCCATC CAACCAATCC GCCAAAAOTC CCTCCTCCAC CATAACCCCT CCTCTAGCCC CCCCAAAACC TTCCCCTGCC TTTCCATCTA TCCCaTGACO IT AAGCO 
<WI TCCACACACT TAGAACCAAT AACOTCAAAO AACTTACACT A T CC TCCT CC ACCCCCCTTC CTAAGCACAA AATTAAGAAT CTTCAGAAGO nCACTCCAC GAAACTACTC CTCTTTAATC 
*»l CCCACACTCC CCCATTCCTT CCCCCCCCTA ACTACATACA ACTGCCAGAA CACCCTACCC CTCCTCCTCC ACACCCCCAC CACCCCCCCC AACTTPTACC GaCaCCCTCA CCATCTACAC 
SUI CTCATAACAC CTCCCTTCATCTCACACACA TCTCACTCCA TATCCATCAC ACTACCGAAO CCTCACTTTT TTCCAGCTTT ACCCOATCCC ACAACTCTATTACTAOTATO CACAorreoT 
mi CGTCACGACC tacttcacta cagatagtao accgaagcca CCICCIGU I U CCTGACCTTC ATCCCOTCCA AGACCCTCCC CCTATTCCAC CCCCAAGCCT AAAOAACATO ccccccctcc 
5401 CA nCCGC AAO AA'AfiMC CCC ACTCCACCCO CAACCAATAG CTCTCACTCC CTCCACCTCT CTTTTGCTGG GCTATCCATO TCCCTCCCAT CAATTTTCGA CGGAOACACO CCCeOCCACC 
SMI CAGCCCTACA ACCCCTOCCA ACAGOCCCCA CCCATOTGCC TaTCTCTTTC OGaTCOTTTT (XGACCGAGA CATTCATGAG CTGAOCCCCA CACTAACTGA otccoaaccc cr cciu i im 

GATCATTTOA ACCCGGCCAA CTGAACTCAA TTATATCCTC CCGATCAGCC OTATCTTTTC CACTACCCAa CCaCaCaCCT ACaCGCaGCA cr*TOM?q*r TCaaTaCTGA CTAACCGCCG 
SHI TACCTCCGTA CATATTTTCG accgacacao cccctcccca CTTCCAAAAG AACTCCCTTC TCCAGAACCA OCTTACAGAA CCCACCTTCO ACCGCAATCT CCTGGAAACA ATTCATDCCC 
SUI CCOTOCTCCA CACCTCCAAA fiA O OAACAAC TCAAACTCAO otaccacato ATCCCCACCG AAGCCAACAA AACTACCTAC CACTCTCCTA AACTACAAAA TCAOAAACCC ATAACCACTC 
•001 ACCQACrACT ctcagoacta cqactctata actctcccac agatcaccca OAATCCTATA AOATCACCTA TCCCAAACCA TTCTACICCA CTAGCCTACC CGCCAACTAC TCCGATCCAC 
*HI AGTTCCCTCT AGCTCTCTOT AACAACTATC TCCATGACAA CTATCCCACA CTACCATCTT ATCACaTTAC TOaCGACTAC CATOCTTACT TGGATATOCT AOACCCCACA GT CCCC10U. 
•241 TCGATACTCC AACCTTCTCC CCCGCTAACC TTACAACTTA CCCO AAAAAA CATBAGTATA GAGCCCCCAA TATCCCCAGT OCCCTTCCAT CACCOATCCA CAACACCCTA CAAAATCTOC 
Oil TCATT6CCCC AACTAAAACA AATTOCAACO TCACOCAGAT CCOTGAACTO CCAACACTCG ACTCAGCCAC ATTCAATCTC CAATCCTTTC OAAAATATCC ATCTAATCAC GAOTATTOCG 
*«l ACOACTTCCC TCCGAAGCCA ATTAGGATTA CCACTGAGTT TUtCACCCCA TATCTAOCTA GACTGAAAGG CCCTAAGGCC CCCGCACTAT TTCCAAAGAC CTATAATTTO CTCCCATTGC 

AAGAAOTCCC TATOCATACA TTCCTCATCO ACATCAAAAO ACACCTCAAA GTTACACCAO GCACOAAACA CACACAAGAA ACACCCAAAG TACAACTCAT ACAAOCCCCA GAACCCCTCG 
4711 CCACTCCTTA CTT A T OCGCO ATTCACCGOC AATTACTGCO TAGGCTTACG UtLUlLUUL TTCCAAACAT TCACACCCTT TTTOACATOT fwwr^/^ ttTTGATOCA ATCATAGCAC 
Oil AACACTTCAA GCAAOOCCAC CCOOTACTGO AGACGGATAT COCATCATTC OACAAAAGCC AACACOACOC TATOOCOTTA ACCOCRCTGA TQATCTTBQA GGACCTDOOT OTOOATCAAC 
49*1 CACTACTCGA CTTOATCGAO IULUJ.U1U GACAAATATC ATCCACOCAT CTACCTACCC CTACTCOTTT TAAATTCOCC GCOATOATOA AATOTGAAT GTTCCICACA CTTTTTOTCA 
m ACACAOTTTT CAATCTCCTT ATCOCCACCA GACTACTAGA Afi A BC OC C TT AAAAOOTCCA OATOTOCAOC OTTCATTCCC GACOACAACA TGATACAIGO AOTAOTATCT GAjCAAAGAAA 
T»l TCOCTCACAO UHI.U.L A CC TGOCTCAACA TGGACCTTAA CATCATCOAC OCAOTCATCO OTQAQAQACC accttacitc toooooooat TTATCTTCCA AO AHLUUH ACT1CCACAO 
7111 COTCCCCCOT COCOOACCCC CnSAAAACOC TOTTTAACTT CCCTAAAOCO CTCCCAOCCO ACCAOOACCA AOACCAACAC AGAABACCCO CTCTQCIAQA TOAAACAAAO OCCTOOTTTA 
7441 CACTACOTAT AACAOOCA C T TTAGCACTCO CUiUMffi A T LUXJIATGAO OTAOACAATA TTACACCTCT CCTACRJOCA TTCACAACTT TTOCCCAGAG *' A **.*?*gC* TTCCAAOCCA 
7341 TCAGAGGGGA AATAAAGCAT CTCTACOOTO GTCCTAAATA OTCACCATAO TACATTTCAT CTCACTAATA CTACAACACC ACCACCATCA ATAGAGQATT CTTTAACATQ ""j i iji; 
7611 O CCCC1 1 IXC COCCCCC A C T GCCAlTOQA MXWUM AACGAOOCA O CTrVftXaU lUXIUUXU CAACOQOCra GCTTCTCAAA ICCAGCAACT OJKTACAOOC OTCAGTOCCC 
TBI TAOTCATTCO ACAGOCAACT AQACCfCAAC CCCCACOTCC A CnC CC OICA CCCC O CC A O A AOAAOCAOGC GCCOAOTAA CC ATCfi A AGC CGAAOAAACC AAAAACCCAO tifft*" t VitK 
mi AGAAGCAACC TSCAAAACCC AAACGCGGAA AGAOA C AO CO CATOOCACTT AAOTTQaAGO CCGACAQATT CTTCGACCTC AACAACGAGO ACCOAGATCrr CATOCGQCAC GCACTOOCCA 

KXI TGGAAGGAAA GGTAATOAAA cctctgcaco tcaaaggaac CATCGa C CA C cctctoctat caaagcicaa ATTTACCAAG T^GTCAGCAT acgacatgoa ottcgcacao ttqccagtca 

1141 ACAT0A6AAO TGAGGCATTC ACCTACACCA OTOAACA CCC CGAAGGATTC TATAACTOGC ACCACQgAOC GGTGCAGTAT AGTGGAGGTA QATTTACCAT CCCTCGOOGA GTAGGAOGGA 
Otl OAGOAOACAO CUiiLUlLUi ATCATGGATA ACTCCCCTCG OOTTOTOGOG ATACTCCTCO GTOGAGCTOA TOAAGOAACA CGAACIGCCC 1 1 1LUUIU3 I CACCTGGAAT AOTAAAGGGA 
S4M AGACAATTAA GACOACCCCO GAAGGCACAO AAGAOTGOTC CO CAOCA CC A CTBOTCACGO CAATUTU7TT GCICCGAAAT OTGAGCTTOC CATGCGACCO CCCGCCCACA TQCTATACCC 
1521 OCGAACCTTC HMMU.U IL CACATCCTTO AACAOAA CCT OAACCATOAO occtacgata CCCTGCTCAA TGCCATATTO CGGTGCGGAT OCTC1CGCAG aa g caaaaca agcotcacto 
U41 ACQACTTTAC CCTGACCAGC CCCTACTTCO GCACA7GCTC OTACIGCCAC CATACIGAAC CGTOCTTCAG CCCTBTTAAG ATCGAGCAGG TCTOCOACOA AGCGGACGAT AACACCATAC 
1741 OCATACAGAC 1 ILOJLUA O TTTCGATACO A CC AMC CO O AC C AO C AA OC OCAAACAA GT ACCOCTACAT GTCOCTTGAG CAGGATCACA CCGTTAAAGA AGOCACCATO GATQACATCA 
«UI AGATTAGCAC CTCAGOACCG TGTAGAAGGC TTAGCTACAA AGGATACTTT CICCTCOCAA AATCCCCTCC AOOOOATAWT GTAACGGTTA GCATAGTQAO TAGCAACTCA GCAACCTCAT 
KOI GTACACTCGC CCGCAAGATA AAACCAAAAT TCGTGGGACO OGAAAAATAT OATCTACCTC CCOTTCACGO TAAAAAAATT CCniiL ACAO TUTACOACCO TCTQAAAGAA ACAACTCCAO 
0121 OCTACATCAC TATGCACAGG CCOOGACfflT AOGCTTATAC ATOCTACCTO GAAOAATCAT CAGQGAAAGT TTACOCAAAfl CC0CCA7CTO OOAAOAACA T TAOGTATSAG TGCAAOTGCO 
9241 GCGACTACAA OACCOOAACC GTTTCGACCC GCACCGAAAT CACICCTTGC ACCOCCATCA AGCACTOCGT CGOCTATAAG AGCOACCAAA CGAAOTOOGT CnCAACTCA CCGGACTTOA 
9341 TCAGACATSA CGACCACAfO O CC CAAOOOA AATTGCATTT GOCTTTCAAO TTCAT CCCGA GTACCTCCAT GGTOCCTTJTT GCCCACGCGC CGAATOTAAT ACATOOCTTT AAACACATCA 
«ai OCCTCCAATT AGATACAGAC CACTTGACAT lUUlACC AC CAGQAOA CT A CGOOCAAACC CGGAACCAAC CACTSAATGG ATCOTCCOAA ACACGGTCAG AAACTTCACC GTCGAOCGAG 
9401 ATGGOCTCOA ATACATATGO GOAAATCATG AGCCACTOaO GCTCTATOCC CAAGAOTCAG CACCAGGAGA CCCTCACGGA TGGCCACACO AAATAGTACA OCATTACTAC C ATUXL ATC 
9711 CTOTBTACAC CATCTTACOC CICCCATCAG CTACCGTGCC GATOA1GATT GGCGTAAGCG TTCCACTOIT ATOTCCCTGT AM CCCCCCC. GTCAGTQCCT OACGCCATAC OLLLIUULLC 
9441 CAAACGCCGT AATCCCAACT TCGCTOGCAC 1LIIUIUC10 COTTAGOTOG GCCAATBCTO AAACOTTCAC CGAOACCATO AGTTAll 1UI GGTCOAACAG TCAGCOOTTC 1IUUUJU.C 
9PM AOTTOTGCAT ACCTTTGGCC GCTTTCATCG TTCTAATOCO CTOCTGCICC TKTQCCTCC CTTTTTTAGT GGTTQCCGGC GCCTACCTOO CO AACGTAGA CCCCTACGAA CATOCCACCA 

MOII ctottccaaa tgtgccacao ataccgtata AGGCACTTOT TOAAAGGGCA ggotatoocc coctcaattt ogagatcact gtcatgtcct cogaootttt occttccacc aaccaagaot 

tttm ACATTACCTG CAAATTCACC ACIGTGOTCC CCT CCC CAAA AATCAAATOC 1U.UULIUL I lUOAATOTCA U.LUU.UJT CATBCAGACT ATACCTCCAA GOTCTTCOOA OaOOTCTACC 
tOttl CCTTTATQTO OnGAOOACCG CAATOTTTTT GCGACAOTGA GAACAOCCAO ATOAOTOAGO CCTACOTCGA ACTOTCAGCA GATTGCGCOT CTOACCACGC GCAGGCGATT AAOOTQCACA 
10441 CTOCCGCGAT GAAAOTAGGA CTDCQTATAG TDTACGGGAA CACTACCACT TrCCTACATO 1GTACOTGAA CGGAGTCACA CCaGOAACOT CTAAAGACTT GAAAGTCATA OCTGGACCAA 
K04J TTTCAGCATC OTTTACOCCA TtCQATCATA AGGTCOTTAT CCATOOCGGC CTOOTOTACA ACTATGACTT CCCGGAATAT OOAGGGATGA AA CC A nO A WT OTTTOQAGAC ATTCAAGCTA 
U611 CCTCCTTQAC TAOCAAOGAT CTCATCGCCA GCACAGACAT TAOCCTACTC AAGCCTTCOG CCAAGAAOGT GCATGTCCCG TACACGCAGO CCGCATCAGO ATTTGAGATQ TOOAAAAACA 
Mttl ACTCAGGCCG CCCACTCCAG GAAACCCCAC U1IIUUUIU TAAGATTGCA GTAAATGCGC TCCGAGCGOT OOACTGTTCA T A C CGQAACA TTCCCATTTC TATTGACATC CCCAACOCTO 
10911 CCTTTATCAG GACAICAGAT OCACCACTGG TCTCAACAGT CAAATGTOAA OICAOTGAGT GCACTTATTC AGCAGACTTC GGCGGGATGO OCACCC1CCA GTATGTATCC OArmmAAO 
U041 OTCAATGCCC CGTACATTCG CATTCOAGCA CAGCAACTCT CCAAGAGTOO ACAGTACATG TOCTCGAOAA AOGAOCGOTO ACAGTACACT T T AOCA CCO C GAOTCCACAO GCGAACTTTA 
11141 TCGTATCCCT GTG70GGAAO aagacaacat GCAATCCAGA atotaaacca ccagctgacc ATATCOTGAG CACCCCCCAC AAAAATGACC AAOAATTTCA AGCCGCCATC T C AAAAA C A T 
Hill CATGGACTTG GCTUTTTCCC CTTTTCGGCG CLU.ULUL GCTATTAATT ATAGGACTTA TQATTTTTCC TTGCAGCATG ATGCTGACTA OCACA C OAAO ATGACCOCTA CGCCCCAATO 
ii4oi A7CCGACCAO caaaactcga tctacttccg aggaactgat GTGCATAATG CATCAGGCTG GTACATTACA TCCCCGCTTA CCGCGGGCAA TATAGCAACA CTAAAAACTC GATOTACTTC 
IIS1 C GAGGAACCO CACTGCATAA TGCTGOGCAG TGTTGCCACA TAACCACTAT ATTAACCATT TATCTAGCGO ACGCCAAAAA CTCAATGTAT TTCTGAGGAA GCGTGOTGCA TAATGCCACO 
11441 CAGCCTCTGC ATAACTTTTA TTATTTCTTT TATTAATCAA caaaattttg tttttaacat ttc 



